Abstract


The performance and strategies used in electron reconstruction and selection at CMS
are presented based on data corresponding to an integrated luminosity of 19.7 fb⁻¹,
collected in proton-proton collisions at √s = 8 TeV at the CERN LHC. The paper
focuses on prompt isolated electrons with transverse momenta ranging from about
5 to a few 100 GeV. A detailed description is given of the algorithms used to cluster
energy in the electromagnetic calorimeter and to reconstruct electron trajectories
in the tracker. The electron momentum is estimated by combining the energy
measurement in the calorimeter with the momentum measurement in the tracker.
Benchmark selection criteria are presented, and their performances assessed using Z, Υ,
and J/ψ decays into e⁺e⁻ pairs. The spectra of the observables relevant to electron
reconstruction and selection as well as their global efficiencies are well reproduced
by Monte Carlo simulations. The momentum scale is calibrated with an uncertainty
smaller than 0.3%. The momentum resolution for electrons produced in Z boson
decays ranges from 1.7 to 4.5%, depending on electron pseudorapidity and energy loss
through bremsstrahlung in the detector material.
1 Introduction

Electron reconstruction and selection is of great importance in many analyses performed
using data from the CMS detector, such as standard model precision measurements, searches
and measurements in the Higgs sector, and searches for processes beyond the standard model.
These scientific analyses require excellent electron reconstruction and selection efficiencies
together with small misidentification probability over a large phase space, excellent momentum
resolution, and small systematic uncertainties. A high level of performance has been achieved
in steps, evolving from the initial algorithms for electron reconstruction developed in the
context of online selection [1]. The basic principles of offline electron reconstruction, outlined in
the CMS Physics Technical Design Report [2, 3], rely on a combination of the energy
measured in the electromagnetic calorimeter (ECAL) and the momentum measured in the tracking
detector (tracker), to optimize the performance over a wide range of transverse momentum
(p_T). Throughout the paper, "energy" and "momentum" refer, respectively, to the energy of the
electromagnetic shower initiated by the electron in the ECAL and to the track momentum
measurement in the tracker, while the term "electron momentum" is used to refer to the combined
information. The energy calibration and resolution in the ECAL were discussed in Ref. [4], and
general issues in track reconstruction in Ref. [5]. Preliminary results on electron reconstruction
and selection were also given in Refs. [6-8]. One of the main challenges for precise
reconstruction of electrons in CMS is the tracker material, which causes significant bremsstrahlung along
the electron trajectory. In addition, this bremsstrahlung spreads over a large volume due to the 
CMS magnetic field. Dedicated techniques have been developed to account for this effect El- 
These procedures have been optimized using simulation, and commissioned with data taken 
since 2009. 

This paper describes the reconstruction and selection algorithms for isolated primary electrons,
and their performance in terms of momentum calibration, resolution, and measured
efficiencies. The results are based on data collected in proton-proton collisions at √s = 8 TeV at the
CERN LHC that correspond to an integrated luminosity of 19.7 fb⁻¹. Figure 1 shows the
two-electron invariant mass spectrum from data collected with dielectron triggers. The step near
40 GeV is due to the thresholds used in the triggers. The J/ψ, ψ(2S), Υ(1S), the overlapping
Υ(2S) and Υ(3S) mesons, and the Z boson resonances can be seen, and are used to assess the
performance of the electron momentum calibration and resolution, and to measure the
reconstruction and selection efficiencies.



Figure 1: Two-electron invariant mass spectrum for data collected with dielectron triggers. 
Electron momenta are obtained by combining information from the tracker and the ECAL. 

A crucial and challenging process used as a benchmark in the paper is the decay of the Higgs
boson into four leptons through on-shell Z boson and virtual Z boson (Z*) intermediate states [9].
In the case of a decay into four electrons or two muons and two electrons, one electron can have
a very small p_T that requires good performance down to p_T ≈ 5 GeV. At the other extreme,
electrons with p_T above a few 100 GeV are often used to search for high-mass resonances [10] and
other new processes beyond the standard model.

The paper is organized as follows. Sections 2 and 3 briefly describe the CMS detector, the
online selections, the data, and Monte Carlo (MC) simulations used in this analysis. The electron
reconstruction algorithms, together with the performance of the electron-momentum
calibration and resolution, are detailed in Section 4. The different steps in electron selection, namely
the identification and the isolation techniques, are described in Section 5. Measurements of
reconstruction and selection efficiencies and misidentification probabilities are presented in
Section 6, and results are summarized in Section 7.
2 CMS detector

The central feature of the CMS apparatus is a superconducting solenoid of 6 m internal
diameter, providing a magnetic field of 3.8 T. The field volume contains a silicon pixel and strip
tracker, a lead tungstate crystal ECAL, and a brass and scintillator hadron calorimeter (HCAL),
each one composed of a barrel and two endcap sections. Muons are measured in gas ionization
detectors embedded in the steel flux return yoke outside of the solenoid. Extensive forward
calorimetry complements the coverage provided by the barrel and endcap detectors. A more
detailed description of the CMS detector together with a definition of the coordinate system
and relevant kinematic variables can be found in Ref. [11]. In this section, the origin of the
coordinate system is at the geometrical centre of the detector; in all later sections, however,
unless otherwise specified, the origin is defined to be the reconstructed interaction point (collision
vertex).

The tracker and the ECAL, being the main detectors involved in the reconstruction and
identification of electrons, are described in greater detail in the following paragraphs. The HCAL,
which is used at different steps of electron reconstruction and selection, is also described below.

The CMS tracker is a cylindrical detector 5.5 m long and 2.5 m in diameter, equipped with silicon
that provides a total surface of 200 m² for an active detection region of |η| < 2.5 (the
acceptance). The inner part is based on silicon pixels and the outer part on silicon strip detectors.
The pixel tracker (66 million channels) consists of 3 central layers covering a radial distance r
from 4.4 cm up to 10.2 cm, complemented by two forward endcap disks covering 6 < r < 15 cm
on each side. With this geometry, a deposition of hits in at least 3 layers or disks per track for
almost the entire acceptance is ensured. The strip detector (9.3 million channels) consists of 10
central layers, complemented by 12 disks in each endcap. The central layers cover radial
distances r < 108 cm and |z| < 109 cm. The disks cover up to |z| < 280 cm and r < 113 cm. Since
the tracker extends to |η| = 2.5, precise detection of electrons is only possible up to this
pseudorapidity, despite the larger coverage of the ECAL. In this paper the acceptance of electrons is
restricted to |η| < 2.5, corresponding to the region where electron tracks can be reconstructed
in the tracker.

A consequence of the presence of the silicon tracker is a significant amount of material in front
of the ECAL, mainly due to the mechanical structure, the services, and the cooling system.
Figure 2 shows the thickness of the tracker as a function of η in the |η| < 2.5 acceptance region,
presented in terms of radiation lengths X0 [5]. It rises from ≈0.4 X0 near |η| ≈ 0, to ≈2.0 X0
near |η| ≈ 1.4, and decreases to ≈1.4 X0 near |η| ≈ 2.5. This material, traversed by electrons
before reaching the ECAL, induces a potential loss of electron energy via bremsstrahlung. The
emitted photons can also convert to e⁺e⁻ pairs, and the produced electrons and positrons can
radiate photons through bremsstrahlung, leading to the early development of an
electromagnetic shower in the tracker.

The ECAL is a homogeneous and hermetic calorimeter made of PbWO4 scintillating crystals.
It is composed of a central barrel covering the pseudorapidity region |η| < 1.479 with the
internal surface located at r = 129 cm, and complemented by two endcaps covering 1.479 <
|η| < 3.0 that are located at z = ±315.4 cm. The large density (8.28 g/cm³), the small radiation
length (0.89 cm), and the small Molière radius (2.3 cm) of the PbWO4 crystals result in a compact
calorimeter with excellent separation of close clusters. A preshower detector consisting of two
planes of silicon sensors interleaved with a total of 3 X0 of lead is located in front of the endcaps,
and covers 1.653 < |η| < 2.6.

Figure 2: Total thickness of tracker material traversed by a particle produced at the centre of
the detector expressed in units of X0, as a function of particle pseudorapidity η in the |η| <
2.5 acceptance region. The contribution to the total material of each of the subsystems that
comprise the CMS tracker is given separately for the pixel tracker, strip tracker consisting of
the tracker endcap (TEC), the tracker outer barrel (TOB), the tracker inner barrel (TIB), and the
tracker inner disks (TID), together with contributions from the support tube that surrounds the
tracker, and from the beam pipe, which is visible as a thin line at the bottom of the figure [5].

The ECAL barrel is made of 61 200 trapezoidal crystals with front-face transverse sections of
22 × 22 mm², giving a granularity of 0.0174 in η and 0.0174 rad in φ, and a length of 230 mm
(25.8 X0). The crystals are installed using a quasi-projective geometry, with each one tilted by
an angle of 3° relative to the projective axis that passes through the centre of CMS, to minimize
electron and photon passage through uninstrumented regions. The crystals are organized in
36 supermodules, 18 on each side of η = 0. Each supermodule contains 1 700 crystals, covers
20° in φ, and is made of four modules along η. This structure has a few thin
uninstrumented regions between the modules at |η| = 0, 0.435, 0.783, 1.131, and 1.479 for the end of the
barrel and the transition to the endcaps, and at every 20° between supermodules in φ.
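
The barrel numbers quoted above can be cross-checked with a few lines of arithmetic. The sketch below assumes a supermodule layout of 20 crystals in φ by 85 crystals in η; this layout is not stated in the text, but it is the factorization of 1 700 crystals that gives one crystal per degree over 20°. All variable names are illustrative.

```python
import math

# Numbers quoted in the text.
N_SUPERMODULES = 36        # 18 on each side of eta = 0
CRYSTALS_PER_SM = 1700
PHI_COVERAGE_DEG = 20      # azimuthal coverage of one supermodule
GRANULARITY = 0.0174       # quoted crystal size in eta, and in phi (rad)
ETA_MAX_BARREL = 1.479

# Assumption (not in the text): 20 crystals in phi per supermodule,
# which leaves 85 crystals along eta.
N_PHI_PER_SM = 20
N_ETA_PER_SM = CRYSTALS_PER_SM // N_PHI_PER_SM

total_crystals = N_SUPERMODULES * CRYSTALS_PER_SM
phi_step = math.radians(PHI_COVERAGE_DEG / N_PHI_PER_SM)  # one crystal per degree
eta_reach = N_ETA_PER_SM * GRANULARITY

print(total_crystals)        # 61200
print(round(phi_step, 4))    # 0.0175
print(round(eta_reach, 3))   # 1.479
```

Under this assumption the crystal count, the azimuthal step (one degree is 0.01745 rad, quoted as 0.0174 rad), and the barrel edge at |η| = 1.479 are mutually consistent.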

The ECAL endcaps consist of a total of 14 648 trapezoidal crystals with front-face transverse
sections of 28.62 × 28.62 mm², and lengths of 220 mm (24.7 X0). The crystals are grouped in
5 × 5 arrays. Each endcap is separated into two half-disks. The crystals are installed within a
quasi-projective geometry, with their main axes pointing 1300 mm in z beyond the centre of
CMS (-1300 mm for the endcap at z > 0), resulting in tilts of 2 to 8° relative to the projective
axis that passes through the centre of CMS.

The HCAL is a sampling calorimeter, with brass as the passive material, and plastic scintillator
tiles serving as active material, providing coverage for |η| < 2.9. The calorimeter cells are
grouped in projective towers of granularity 0.087 in η and 0.087 rad in φ in the barrel, and 0.17
in η and 0.17 rad in φ in the endcaps, the exact granularity depending on |η|. A more forward
steel and quartz-fiber hadron calorimeter extends the coverage up to |η| < 5.2.


3 Data and simulation 

The data sample corresponds to an integrated luminosity of 19.7 fb⁻¹ [12], collected at √s =
8 TeV. The results take advantage of the final calibration and alignment conditions of the CMS
detector, obtained using the procedures described in Refs.

The first level (L1) of the CMS trigger system, composed of specially designed hardware
processors, uses information from the calorimeters and muon detectors to select events of interest
in 3.6 μs. The high-level trigger (HLT) processor farm decreases the event rate from about
100 kHz (L1 rate) to about 400 Hz for data storage [13].

The electron and photon candidates at L1 are based on ECAL trigger towers defined by
arrays of 5 × 5 crystals in the barrel and similar but more complex arrays of crystals in the
endcaps. The central trigger tower with largest transverse energy E_T = E sin θ, together with its
next-highest adjacent E_T tower, forms an L1 candidate. Requirements are set on the energy
distribution among the central and neighbouring towers, on the amount of energy in the HCAL
downstream of the central tower, and on the E_T of the electron candidate. The HLT electron
candidates are constructed through associations of energy in ECAL crystals grouped into clusters (as
discussed in Section 4.1) around the corresponding L1 electron candidate and a reconstructed
track with direction compatible with the location of ECAL clusters. Their selection relies on
identification and isolation criteria, together with minimal thresholds on E_T. The identification
criteria are based on the transverse profile of the cluster of energy in the ECAL, the amount of
energy in the HCAL downstream of the ECAL cluster, and the degree of association between the
track and the ECAL cluster. The isolation criterion makes use of the energies that surround the
HLT electron candidate in the tracker, in the ECAL, and in the HCAL.

The electron triggers, corresponding to the first selection step of most analyses using electrons,
require the presence of at least one, two, or three electron candidates at L1 and HLT. Table 1
shows the lowest unprescaled L1 and HLT E_T thresholds.


Table 1: Lowest unprescaled E_T threshold values in GeV used for the L1 and HLT single-,
double-, and triple-electron triggers.

         Single    Double    Triple
L1       20        13, 7     12, 7, 5
HLT      27        17, 8     15, 8, 5


The performance of electron reconstruction and selection is checked with events selected by
the double-electron triggers. These are mainly used to collect electrons from Z boson decays,
but also from low-mass resonances, usually at a smaller rate. To study efficiencies, two
additional dedicated double-electron triggers are introduced to maximize the number of Z → e⁺e⁻
events collected without biasing the efficiency of one of the electrons. Both triggers require a
tightly selected HLT electron candidate, and either a second looser HLT electron or a cluster in
the ECAL, that together have an invariant mass above 50 GeV. Finally, studies of background
distributions and misidentification probabilities are performed using events with Z → e⁺e⁻
or Z → μ⁺μ⁻ decays that contain a single additional jet misidentified as an electron, the latter
also using triggers with two relatively high-p_T muons.

Several simulated samples are exploited to optimize reconstruction and selection algorithms, to
evaluate efficiencies, and to compute systematic uncertainties. The reconstruction algorithms
are tuned mostly on simulated events with two back-to-back electrons with uniform
distributions in η and p_T, with 1 < p_T < 100 GeV. Simulated Drell-Yan (DY) events, corresponding
to generic quark + antiquark → Z/γ* → e⁺e⁻ production, are used to study various
reconstruction and selection efficiencies. Results from the MadGraph 5.1 [14] and POWHEG [15-17]
generators are compared to evaluate systematic uncertainties. These programs are interfaced to
PYTHIA 6.426 [18] for showering of partons and for jet fragmentation. The PYTHIA tune Z2* [19]
is used to generate the underlying event.

Pileup signals caused by additional proton-proton interactions in the same time frame of the
event of interest are added to the simulation. There are on average approximately 15
reconstructed interaction vertices for each recorded interaction, corresponding to about 21
concurrent interactions per beam crossing.

The generated events are processed through a full GEANT4-based [20, 21] detector simulation
and reconstructed with the same algorithms as used for the data. A realistic description of the
detector conditions (tracker alignment, ECAL calibration and alignment, electronic noise) is
implemented in the simulation. In addition, for some specific tasks requiring a more precise
understanding of the detector, a run-dependent version of the simulation is used to match the
evolution of the detector response with time observed in data. This run-dependent simulation
includes the evolution of the transparency of the crystals and of the noise in the ECAL, and
accounts in each event for the effect of energy deposition from interactions in a significantly
increased time window relative to the one containing the event of interest.


4 Electron reconstruction 

Electrons are reconstructed by associating a track reconstructed in the silicon detector with a
cluster of energy in the ECAL. A mixture of a stand-alone approach [3] and the complementary
global "particle-flow" (PF) algorithm [22, 23] is used to maximize the performance.

This section specifies the algorithms used for clustering the energy deposited in the ECAL,
building the electron track, and associating the two inputs to estimate the electron properties.
Most of these algorithms have been optimized using simulation, and adjusted during
data-taking periods. A large part of the section is dedicated to the estimation of electron momentum,
the chain of momentum calibration, and the performance of the momentum scale and
resolution.

4.1 Clustering of electron energy in the ECAL 

The electron energy usually spreads out over several crystals of the ECAL. This spread can
be quite small when electrons lose little energy via bremsstrahlung before reaching the ECAL. For
example, electrons of 120 GeV in a test beam that impinge directly on the centre of a crystal
deposit about 97% of the energy in a 5 × 5 crystal array [24]. For an electron produced within
CMS, the effect induced by radiation of photons can be large: on average, 33% of the electron
energy is radiated before it reaches the ECAL where the intervening material is minimal (η ≈
0), and about 86% of its energy is radiated where the intervening material is the largest (|η| ≈
1.4).
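
These two fractions are consistent with the material thicknesses quoted in Section 2 (about 0.4 X0 near η ≈ 0 and about 2.0 X0 near |η| ≈ 1.4), if one assumes the standard approximation that the mean electron energy falls exponentially with the traversed thickness t, ⟨E⟩ = E0 exp(-t/X0). This relation is not stated in the text and is used here only as a cross-check; the function name is illustrative.

```python
import math

def mean_radiated_fraction(thickness_in_x0: float) -> float:
    """Mean fraction of the electron energy radiated after traversing a
    thickness expressed in radiation lengths.

    Assumption (not from the text): <E> = E0 * exp(-t / X0).
    """
    return 1.0 - math.exp(-thickness_in_x0)

for label, thickness in [("eta ~ 0", 0.4), ("|eta| ~ 1.4", 2.0)]:
    print(f"{label}: {100 * mean_radiated_fraction(thickness):.0f}%")
# eta ~ 0: 33%
# |eta| ~ 1.4: 86%
```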

To measure the initial energy of the electron accurately, it is essential to collect the energy of
the radiated photons that mainly spreads along the φ direction because of the bending of the
electron trajectory in the magnetic field. The spread in the η direction is usually negligible,
except for very low p_T (p_T ≲ 5 GeV). Two clustering algorithms, the "hybrid" algorithm in
the barrel, and the "multi-5 × 5" in the endcaps, are used for this purpose and are described in
the following paragraphs. For the clustering step, the η and φ directions and E_T are defined
relative to the centre of CMS.

The hybrid algorithm exploits the geometry of the ECAL barrel (EB) and properties of the
shower shape, collecting the energy in a small window in η and an extended window in φ [2].
The starting point is a seed crystal, defined as the one containing most of the energy deposited
in any considered region, that has a minimum E_T of E_T,seed^min. Arrays of 5 × 1 crystals
in η × φ are added around the seed crystal, in a range of N_steps crystals in both directions of
φ, if their energies exceed a minimum threshold of E_array^min. The contiguous arrays are grouped
into clusters, with each distinct cluster required to have a seed array with energy greater than a
threshold of E_seed-array^min in order to be collected in the final global cluster, called the supercluster
(SC). These threshold values are summarized in Table 2. They were originally tuned to provide
the best ECAL-energy resolution for electrons with p_T ≈ 15 GeV, but eventually minor adjustments
were made to provide the current performance over a wider range of p_T values.
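
The steps of the hybrid algorithm can be illustrated with a simplified sketch. It is not the CMS implementation: it assumes the seed crystal has already been chosen (the E_T requirement on the seed is applied upstream), represents the barrel as a plain grid of crystal energies indexed by η and φ, and always uses 5 × 1 arrays. The function name and the grid layout are illustrative; the thresholds are those of Table 2.

```python
E_SEED_ARRAY_MIN = 0.35   # GeV, Table 2
E_ARRAY_MIN = 0.1         # GeV, Table 2
N_STEPS = 17              # crystals scanned in each phi direction, Table 2

def hybrid_supercluster(energy, seed_eta, seed_phi):
    """Return (supercluster energy, clusters) for one seed crystal.

    energy[ieta][iphi] holds crystal energies in GeV; the phi index wraps.
    Each cluster is a list of (phi offset, array energy) pairs.
    """
    n_eta, n_phi = len(energy), len(energy[0])

    def array_energy(dphi):
        # Sum a 5 x 1 array in eta x phi, centred on the seed eta index.
        iphi = (seed_phi + dphi) % n_phi
        lo, hi = max(0, seed_eta - 2), min(n_eta, seed_eta + 3)
        return sum(energy[ieta][iphi] for ieta in range(lo, hi))

    # Keep the arrays above the minimum threshold, scanning both phi directions.
    arrays = [(dphi, array_energy(dphi)) for dphi in range(-N_STEPS, N_STEPS + 1)]
    kept = [(dphi, e) for dphi, e in arrays if e > E_ARRAY_MIN]

    # Group arrays that are contiguous in phi into clusters.
    clusters = []
    for dphi, e in kept:
        if clusters and dphi == clusters[-1][-1][0] + 1:
            clusters[-1].append((dphi, e))
        else:
            clusters.append([(dphi, e)])

    # Retain a cluster only if its most energetic array passes the
    # seed-array threshold.
    clusters = [c for c in clusters if max(e for _, e in c) > E_SEED_ARRAY_MIN]
    return sum(e for c in clusters for _, e in c), clusters

# Toy 5 x 40 grid: main shower at phi index 20, a bremsstrahlung photon at 26,
# and a small deposit at 31 that fails the seed-array threshold.
grid = [[0.0] * 40 for _ in range(5)]
grid[2][20], grid[2][21], grid[2][26], grid[2][31] = 30.0, 2.0, 5.0, 0.2
sc_energy, sc_clusters = hybrid_supercluster(grid, seed_eta=2, seed_phi=20)
print(sc_energy, len(sc_clusters))   # 37.0 2
```

In the toy example the photon cluster six crystals away in φ is recovered, while the isolated 0.2 GeV deposit is dropped.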

The multi-5 × 5 algorithm is used in the ECAL endcaps (EE), where crystals are not arranged in
an η × φ geometry. It starts with the seed crystals, the ones with local maximal energy relative
to their four direct neighbours, which must fulfill an E_T requirement of E_T > E_T,EEseed^min.
Around these seeds and beginning with the largest E_T, the energy is collected in clusters of
5 × 5 crystals, that can partly overlap. These clusters are then grouped into an SC if their total
E_T satisfies E_T > E_T,cluster^min, within a range in η of η_range and a range in φ of φ_range
around each seed crystal. These threshold values are summarized in Table 2. The
energy-weighted positions of all clusters belonging to an SC are then extrapolated to the planes of the
preshower, with the most energetic cluster used as reference point. The maximum distance in
φ between the clusters and their reference point is used to define the preshower clustering
range along φ, which is then extended by ±0.15 rad. The range along η is set to 0.15 in both
directions. The preshower energies within these ranges around the reference point are then
added to the SC energy.
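
The seeding and 5 × 5 collection steps can be sketched as follows. This is a simplified illustration, not the CMS code: it works on a plain two-dimensional grid of crystal E_T values, assigns each crystal to at most one cluster, and skips a seed that has already been absorbed by a more energetic cluster (both are assumptions made here). The grouping of clusters into an SC and the preshower step are left out. The function name is illustrative.

```python
E_T_EE_SEED_MIN = 0.18   # GeV, Table 2

def multi5x5_clusters(et):
    """Build 5 x 5 clusters on a 2D grid of crystal E_T values (GeV).

    Returns a list of (seed index, cluster E_T), ordered by decreasing seed E_T.
    """
    n_x, n_y = len(et), len(et[0])

    def value(ix, iy):
        # Crystals outside the grid count as empty.
        return et[ix][iy] if 0 <= ix < n_x and 0 <= iy < n_y else 0.0

    # Seeds: crystals above threshold that are local maxima with respect
    # to their four direct neighbours.
    seeds = [
        (ix, iy)
        for ix in range(n_x)
        for iy in range(n_y)
        if et[ix][iy] > E_T_EE_SEED_MIN
        and all(et[ix][iy] > value(ix + dx, iy + dy)
                for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)))
    ]
    seeds.sort(key=lambda s: et[s[0]][s[1]], reverse=True)

    used = set()   # assumption: a crystal contributes to one cluster only
    clusters = []
    for sx, sy in seeds:
        if (sx, sy) in used:
            continue   # assumption: absorbed seeds do not start a new cluster
        window = [(sx + dx, sy + dy) for dx in range(-2, 3) for dy in range(-2, 3)]
        members = [c for c in window
                   if 0 <= c[0] < n_x and 0 <= c[1] < n_y and c not in used]
        used.update(members)
        clusters.append(((sx, sy), sum(et[x][y] for x, y in members)))
    return clusters

grid = [[0.0] * 12 for _ in range(7)]
grid[3][3], grid[3][4], grid[3][5], grid[3][6] = 10.0, 1.0, 0.5, 4.0
print(multi5x5_clusters(grid))   # [((3, 3), 11.5), ((3, 6), 4.0)]
```

The two 5 × 5 windows in the example overlap, and the shared crystals are counted once, in the cluster built around the more energetic seed.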

Table 2: Threshold values of parameters used in the hybrid superclustering algorithm in the
barrel, and in the multi-5 × 5 superclustering algorithm in the endcaps.

Barrel                                Endcaps
Parameter           Value             Parameter          Value
E_T,seed^min        1 GeV             E_T,EEseed^min     0.18 GeV
E_seed-array^min    0.35 GeV          E_T,cluster^min    1 GeV
E_array^min         0.1 GeV           η_range            0.07
N_steps             17 (≈0.3 rad)     φ_range            0.3 rad


The SC energy corresponds to the sum of the energies of all its clusters. The SC position is 
calculated as the energy-weighted mean of the cluster positions. Because of the non-projective 
geometry of the crystals and the lateral shower shape, a simple energy-weighted mean of the 
crystal positions biases the estimated position of each cluster towards the core of the shower. A 
better position estimate is obtained by taking a weighted mean, calculated using the logarithm 
of the crystal energy, and applying a correction based on the depth of the shower [3].
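
The difference between the two weighting schemes can be seen in a one-dimensional sketch. The weight used below, w_i = max(0, w0 + ln(E_i / E_cluster)), is a common form of logarithmic weighting; both this form and the value w0 = 4.2 are assumptions made for illustration and are not taken from the text. The shower-depth correction is not included, and the function names are illustrative.

```python
import math

def energy_weighted_position(crystals):
    """Linear energy-weighted mean of (position, energy) pairs."""
    e_cluster = sum(e for _, e in crystals)
    return sum(x * e for x, e in crystals) / e_cluster

def log_weighted_position(crystals, w0=4.2):
    """Mean position using logarithmic weights.

    Assumption: w_i = max(0, w0 + ln(E_i / E_cluster)); w0 is illustrative.
    """
    e_cluster = sum(e for _, e in crystals)
    weights = [max(0.0, w0 + math.log(e / e_cluster)) if e > 0 else 0.0
               for _, e in crystals]
    return sum(w * x for w, (x, _) in zip(weights, crystals)) / sum(weights)

# Three adjacent crystals at positions -1, 0, +1 (in crystal widths), for a
# shower that is displaced towards the crystal at +1.
crystals = [(-1.0, 1.0), (0.0, 80.0), (1.0, 19.0)]
print(energy_weighted_position(crystals))
print(log_weighted_position(crystals))
```

For this toy input the linear mean stays close to the centre of the most energetic crystal, while the logarithmic weights give more influence to the neighbouring crystal and move the estimate away from the core.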

Figure 3 illustrates the effect of superclustering on the recovery of energy from simulated Z →
e⁺e⁻ events, comparing the energy reconstructed within the SC to the one reconstructed using
a simple matrix of 5 × 5 crystals around the most energetic crystal in a) the barrel and b) the
endcaps. The tails at small values of the reconstructed energy E over the generated one (E_gen)
are seen to be significantly reduced through the superclustering.

In addition, as part of the PF-reconstruction algorithm, another clustering algorithm is
introduced that aims at reconstructing the particle showers individually. The PF clusters are
reconstructed by aggregating around a seed all contiguous crystals with energies of two standard
deviations (σ) above the electronic noise observed at the beginning of the data-taking run, with
E_seed > 230 MeV in the barrel, and E_seed > 600 MeV or E_T > 150 MeV in the endcaps. An
important difference relative to the stand-alone approach is that it is possible to share the
energy of one crystal among two or more clusters. Such clusters are used in different steps of
electron reconstruction, and are hereafter referred to as PF clusters.



Figure 3: Comparison of the distributions of the ratio of reconstructed over generated energy 
for simulated electrons from Z boson decays in a) the barrel, and b) the endcaps, for energies 
reconstructed using superclustering (solid histogram) and a matrix of 5x5 crystals (dashed 
histogram). No energy correction is applied to any of the distributions. 

4.2 Electron track reconstruction 

Electron tracks can be reconstructed in the full tracker using the standard Kalman filter (KF)
track reconstruction procedure used for all charged particles [5]. However, the large radiative
losses for electrons in the tracker material compromise this procedure and lead in general to a
reduced hit-collection efficiency (hits are lost when the change in curvature is large because of
bremsstrahlung), as well as to a poor estimation of track parameters. For these reasons, a
dedicated tracking procedure is used for electrons. As this procedure can be very time consuming,
it has to be initiated from seeds that are likely to correspond to initial electron trajectories. The
key point for reconstruction is to collect the hits efficiently, while preserving an optimal
estimation of track parameters over the large range of energy fractions lost through bremsstrahlung.

4.2.1 Seeding 

The first step in electron track reconstruction, also called seeding, consists of finding and
selecting the two or three first hits in the tracker from which the track can be initiated. The
seeding is of primary importance since its performance greatly affects the reconstruction efficiency.
Two complementary algorithms are used and their results combined. The ECAL-based
seeding starts from the SC energy and position, used to estimate the electron trajectory in the first
layers of the tracker, and selects electron seeds from all the reconstructed seeds. The
tracker-based seeding relies on tracks that are reconstructed using the general algorithm for charged
particles, extrapolated towards the ECAL and matched to an SC. These algorithms were first
commissioned with data taken in 2010, using electrons from W boson decays. The distributions
in data were found to agree with expectations, even at low p_T, and tuning of the parameters
obtained from simulation has been left essentially unchanged.

In the ECAL-based seeding, the SC energy and position are used to extrapolate the electron
trajectory towards the collision vertex, relying on the fact that the energy-weighted average
position of the clusters is on the helix corresponding to the initial electron energy, propagated
through the magnetic field without emission of radiation. The back propagation of the helix
parameters through the magnetic field from the SC is performed for both positive and
negative charge hypotheses. The intersections of helices with the innermost layers or disks predict
the seeding hits. The SC are selected to limit the number of misidentified seeds using an E_T
requirement of E_T^SC > 4 GeV, together with a hadronic veto selection of H/E_SC < 0.15, with
E_SC being the energy of the SC, and H the sum of the HCAL tower energies within a cone of
ΔR = √((Δη)² + (Δφ)²) = 0.15 around the electron direction. This procedure reduces
computing time.
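
The SC preselection described above can be sketched as a short function. The thresholds are those quoted in the text; the function names and the representation of HCAL towers as (η, φ, energy) tuples are illustrative assumptions.

```python
import math

def delta_r(eta1, phi1, eta2, phi2):
    """Angular distance, with the phi difference wrapped into [-pi, pi)."""
    dphi = (phi1 - phi2 + math.pi) % (2 * math.pi) - math.pi
    return math.hypot(eta1 - eta2, dphi)

def passes_sc_preselection(sc_et, sc_energy, sc_eta, sc_phi, hcal_towers,
                           et_min=4.0, h_over_e_max=0.15, cone=0.15):
    """Apply the E_T requirement and the hadronic veto to one supercluster.

    hcal_towers is an iterable of (eta, phi, energy) tuples, energies in GeV.
    """
    # H: sum of HCAL tower energies inside the cone around the SC direction.
    h = sum(e for eta, phi, e in hcal_towers
            if delta_r(eta, phi, sc_eta, sc_phi) < cone)
    return sc_et > et_min and h / sc_energy < h_over_e_max

towers = [(0.50, 1.00, 2.0), (0.90, 1.00, 50.0)]   # second tower is outside the cone
print(passes_sc_preselection(35.5, 40.0, 0.5, 1.0, towers))   # True
```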

On the other hand, tracker seeds are formed by combining pairs or triplets of hits with the 
vertices obtained from pixel tracks. Combinations of first and second hits from tracker seeds 
are located in the barrel pixel layers (BPix), the forward pixel disks (FPix), and in the TEC to 
improve the coverage in the forward region. Only a subset of the seeds leads eventually to 
tracks. 

For each SC, a seed selection is performed by comparing hits of each tracker seed and the
SC-predicted hits within windows in φ and z (or in transverse distance r in the forward regions
where hits are only in the disks). The windows for the first and second hits are optimized using
simulation to maximize the efficiency, while reducing the number of misidentified candidates
to a level that can be handled within the CPU time available for electron track reconstruction.
The overall efficiency of the ECAL-based seeding is ≈92% for simulated electrons from Z boson
decay.

The windows for the first hit are wide, and adapted to the uncertainty in the measurement
of φ_SC, and the spread of the beam spot in z (σ_z, changing with beam conditions, and typically
about 5 cm in 2012). The first φ window is chosen to depend on E_T^SC, to reduce the number of
misidentified candidates, and to be asymmetrical, to take into account the uncertainty on the
collected energy of the SC. When the first hit of a tracker seed is matched, the information is used to refine
the parameters of the helix, and to search for a second-hit compatibility with more restricted
windows. A seed is selected if its first two hits are matched with the predictions from the SC.

Tables 3 and 4 give the values of the first and second window acceptance parameters. For
electrons with 5 < E_T^SC < 35 GeV, the first window size in φ (δφ) is a function of 1/E_T^SC. The
point given at 10 GeV represents the median of the dependence on E_T^SC.

Table 3: Values of the δz, δr, and δφ parameters used for the first window of seed selection, for
three ranges of E_T^SC, with σ_z being the standard deviation of the beam spot along the z axis.
For electron candidates with negative charge, the same δφ window is used, but with opposite
signs.

E_T^SC (GeV)    δz (BPix)    δr (FPix or TEC)    δφ (rad) (positive charge)
<5              ±5σ_z        ±5σ_z               [-0.075; 0.155]
10              ±5σ_z        ±5σ_z               [-0.046; 0.096]
>35             ±5σ_z        ±5σ_z               [-0.026; 0.054]
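
The dependence of the first δφ window on E_T^SC can be sketched as follows. The exact functional form is not given in the text; the sketch assumes a linear interpolation in 1/E_T^SC between the 5 GeV and 35 GeV rows of Table 3, which reproduces the 10 GeV row to the quoted precision. The function name is illustrative.

```python
def first_phi_window(et_sc, charge=+1):
    """First-hit delta-phi window (rad) as (low, high) for E_T^SC in GeV.

    Assumption: linear interpolation in 1/E_T^SC between the 5 GeV and
    35 GeV rows of Table 3, constant outside that range.
    """
    low_5, high_5 = -0.075, 0.155      # row for E_T^SC below 5 GeV
    low_35, high_35 = -0.026, 0.054    # row for E_T^SC above 35 GeV
    et = min(max(et_sc, 5.0), 35.0)
    frac = (1.0 / et - 1.0 / 35.0) / (1.0 / 5.0 - 1.0 / 35.0)
    low = low_35 + frac * (low_5 - low_35)
    high = high_35 + frac * (high_5 - high_35)
    # Negative charge: same window with opposite signs.
    return (low, high) if charge > 0 else (-high, -low)

low, high = first_phi_window(10.0)
print(round(low, 3), round(high, 3))   # -0.046 0.096
```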


Table 4: Values of the δz, δr, and δφ parameters used in different regions of the tracker for the
second window of seed selection.

δz (cm)    δr (cm)    δr (cm)    δφ (rad)    δφ (rad)
(BPix)     (FPix)     (TEC)      (BPix)      (FPix or TEC)
±0.09      ±0.15      ±0.2       ±0.004      ±0.006

Figures 4 a) and b) show respectively the differences Δz_2 and Δφ_2 between the measured and
predicted positions in z (in the barrel pixels, BPix), and in φ (in all the tracker subdetectors),
for the second window of each electron track seed, in Z → e⁺e⁻ events in data and in
simulation. The distributions in data are slightly wider than in simulation, with the effect more
pronounced in Δφ_2, which is related directly to the difference in energy resolution between
data and simulation.




Figure 4: Distributions of the difference between predicted and measured values of the z_2
and φ_2 variables for hits in the second window of the ECAL-based seeding, for electrons from
Z → e⁺e⁻ decays in data (dots) and simulation (histograms): a) Δz_2 (barrel pixel), and b) Δφ_2
(all tracker subdetectors). The data-to-simulation ratios are shown below the main panels.


Tracker-based seeding is developed as part of the PF-reconstruction algorithm, and
complements the seeding efficiency, especially for low-p_T or nonisolated electrons, as well as for
electrons in the barrel-endcap transition region.

The algorithm starts with tracks reconstructed with the KF algorithm. The electron trajectory
can be reconstructed accurately using the KF approach when bremsstrahlung is negligible. In
this case, the KF algorithm collects hits up to the ECAL, the KF track is well matched to the
closest PF cluster, and its momentum is measured with good precision. As a first step of the
seeding algorithm, each KF track, with direction compatible with the position of the closest PF
cluster that fulfills the matching-momentum criterion of E/p between a lower cutoff and 3, has
its seed selected for electron track reconstruction. The cutoff is set to 0.65 for electrons with
2 < p_T < 6 GeV, and to 0.75 for electrons with p_T > 6 GeV.
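
The matching-momentum criterion can be written as a short function. The cutoff values are those quoted above; the function name is illustrative, and two choices are assumptions made here: tracks with p_T of 2 GeV or less are rejected, since no cutoff is quoted for them, and a track at exactly 6 GeV uses the higher cutoff.

```python
def passes_e_over_p(cluster_energy, track_p, track_pt):
    """Matching-momentum criterion for a KF track and its closest PF cluster.

    All quantities are in GeV.
    """
    if track_pt <= 2.0:
        return False   # assumption: no cutoff is quoted for p_T <= 2 GeV
    # Lower cutoff on E/p depends on the track p_T (0.75 at exactly 6 GeV
    # is an assumption).
    cutoff = 0.65 if track_pt < 6.0 else 0.75
    return cutoff < cluster_energy / track_p < 3.0

print(passes_e_over_p(7.0, 10.0, 5.0))   # True:  E/p = 0.7 > 0.65
print(passes_e_over_p(7.0, 10.0, 8.0))   # False: E/p = 0.7 < 0.75
```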


For tracks that fail the above condition, indicating potential presence of significant
bremsstrahlung, a second selection is attempted. As the KF algorithm cannot follow the change of
curvature of the electron trajectory because of the bremsstrahlung, it either stops collecting
hits, or keeps collecting them, but with a bad quality identified through a large value of the
$\chi^2_{KF}$. KF tracks with a small number of hits or a large $\chi^2_{KF}$ are therefore
refitted using a dedicated Gaussian sum filter (GSF) [25], as described in Section 4.2.2.


The number of hits and the quality of the KF track, $\chi^2_{KF}$, the quality of the GSF track,
$\chi^2_{GSF}$, and the geometrical and energy matching of the ECAL and tracker information are
used in a multivariate (MVA) analysis [26] to select the tracker seed as an electron seed.


The electron seeds found using the two algorithms are combined, and the overall efficiency of
the seeding is predicted to be >95% for simulated electrons from Z boson decay.

4.2.2 Tracking

The selected electron seeds are used to initiate electron-track building, which is followed by 
track fitting. The track building is based on the combinatorial KF method, which for each 
electron seed proceeds iteratively from the track parameters provided in each layer, including 
one-by-one the information from each successive layer i5|. The electron energy loss is modelled 
through a Bethe-Heitler function. To follow the electron trajectory in case of bremsstrahlung 
and to maintain good efficiency, the compatibility between the predicted and the found hits in 
each layer is chosen not to be too restrictive. When several hits are found compatible with those 
predicted in a layer, then several trajectory candidates are created and developed, with a limit 
of five candidate trajectories for each layer of the tracker. At most, one missing hit is allowed for 
an accepted trajectory candidate, and, to avoid including hits from converted bremsstrahlung 
photons in the reconstruction of primary electron tracks, an increased penalty is applied to 
trajectory candidates with one missing hit. Figure 5 shows the number of hits collected using
this procedure for electrons from a Z boson sample in data and in simulation, compared with
the KF procedure used for all the other charged particles in the barrel and in the endcaps. The Z
boson selections in data and in simulation require both decay electrons to satisfy $p_T > 20$ GeV,
several criteria pertaining to isolation and to rejection of converted photons, and a condition
of $|m_{e^+e^-} - m_Z| < 7.5$ GeV on their invariant mass. The structure in the figure reflects the
geometry of the tracker. This comparison shows that shorter electron tracks are obtained using
the standard KF than using the dedicated electron building. The number of hits for the KF 
procedure is set to zero when there is no KF track associated with the electron. While the 
general behaviour is well reproduced, disagreement is observed between data and simulation 
due to an imperfect description of the active tracker sensors in the simulation. 
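
As a numerical illustration of the Bethe-Heitler energy-loss model mentioned above, the sketch below samples the fraction $z$ of energy that an electron retains after crossing a thickness $t$ (in radiation lengths). It assumes the parametrization commonly used in the literature, $f(z) = (-\ln z)^{c-1}/\Gamma(c)$ with $c = t/\ln 2$, which is not spelled out in this paper; the function names are hypothetical.

```python
import math
import random


def sample_retained_fraction(thickness_x0, rng):
    """Draw the retained energy fraction z = E_after / E_before.

    Assumed model: f(z) = (-ln z)^(c-1) / Gamma(c), c = t / ln 2.
    If u follows a Gamma(shape=c, scale=1) law, z = exp(-u) follows f(z).
    """
    c = thickness_x0 / math.log(2.0)
    return math.exp(-rng.gammavariate(c, 1.0))


def mean_retained_fraction(thickness_x0, n_samples=100_000, seed=1):
    """Monte Carlo estimate of <z>; analytically this equals exp(-t)."""
    rng = random.Random(seed)
    total = sum(sample_retained_fraction(thickness_x0, rng) for _ in range(n_samples))
    return total / n_samples
```

The distribution is strongly non-Gaussian (most electrons lose little, a few lose almost everything), which is why a single-Gaussian KF description of the energy loss is inadequate for electrons.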




Figure 5: Comparison of the number of hits collected with the dedicated electron building and
KF procedures in data (symbols) and in simulation (histograms), for electrons obtained using
a $Z \to e^+e^-$ selection, a) in the barrel, and b) in the endcaps.

Once the hits are collected, a GSF fit is performed to estimate the track parameters. The energy 
loss in each layer is approximated by a mixture of Gaussian distributions. A weight is
attributed to each Gaussian distribution that describes the associated probability. Two estimates
of track properties are usually exploited at each measurement point that correspond either to
the weighted mean of all the components, or to their most probable value (mode). The former
provides an unbiased average, while the latter peaks at the generated value and has a smaller
standard deviation for the core of the distribution |!3|. This is shown in Fig. 6, where the ratio
$p_T/p_T^{gen}$ is compared for the two estimates, for simulated electrons from Z boson decays. For
these reasons, the mode estimate is chosen to characterize all the parameters of electron tracks. 
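
The difference between the two estimates can be seen on a toy one-dimensional Gaussian mixture. The sketch below is illustrative only: the component values are invented, and the mode is found by a simple grid scan rather than by whatever procedure the GSF fit uses internally.

```python
import math


def mixture_pdf(x, components):
    """Density of a Gaussian mixture; components are (weight, mean, sigma) tuples."""
    return sum(
        w * math.exp(-0.5 * ((x - mu) / s) ** 2) / (s * math.sqrt(2.0 * math.pi))
        for w, mu, s in components
    )


def mixture_mean(components):
    """Weighted mean of all the components."""
    total_weight = sum(w for w, _, _ in components)
    return sum(w * mu for w, mu, _ in components) / total_weight


def mixture_mode(components, n_grid=20001):
    """Most probable value, located by scanning a uniform grid (toy approach)."""
    lo = min(mu - 5.0 * s for _, mu, s in components)
    hi = max(mu + 5.0 * s for _, mu, s in components)
    step = (hi - lo) / (n_grid - 1)
    grid = (lo + i * step for i in range(n_grid))
    return max(grid, key=lambda x: mixture_pdf(x, components))


# Invented example for a ratio such as pT / pT(gen): a narrow core at 1.0
# plus a broad, low-side component mimicking bremsstrahlung losses.
TOY_COMPONENTS = [(0.7, 1.0, 0.02), (0.3, 0.7, 0.15)]
```

For `TOY_COMPONENTS` the weighted mean is pulled down to 0.91 by the broad component, while the mode stays very close to 1.0, which mirrors the qualitative behaviour described in the text.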



Figure 6: Distribution of the ratio of reconstructed over generated electron $p_T$ in simulated
$Z \to e^+e^-$ events, reconstructed through the most probable value of the GSF track components
(solid histogram), and its weighted mean (dashed histogram).

This procedure of track building and fitting provides electron tracks that can be followed up
to the ECAL, which allows track parameters to be extracted at the surface of the ECAL. The
fraction of energy lost through bremsstrahlung is estimated using the momentum at the point of
closest approach to the beam spot ($p_{in}$), and the momentum extrapolated to the surface of the
ECAL from the track at the exit of the tracker ($p_{out}$), and is defined as
$f_{brem} = [p_{in} - p_{out}]/p_{in}$. This variable is used to estimate the electron momentum,
and it enters into the identification procedure. In Fig. 7, this observable is shown for
$Z \to e^+e^-$ data and simulated events, as well as for misidentified electron candidates from
jets in data enriched in Z+jets, in four regions of the ECAL barrel and endcaps. Each
distribution is normalized to the area of the $Z \to e^+e^-$ data. As mentioned above, the Z boson
selections in data and in simulation require both decay electrons to satisfy $p_T > 20$ GeV, as
well as several isolation and photon conversion rejection criteria, and a condition of
$|m_{e^+e^-} - m_Z| < 7.5$ GeV on their invariant mass. The sample of misidentified electrons is
obtained by selecting nonisolated electron candidates with $p_T > 20$ GeV, in events selected with
a pair of identified leptons (electrons or muons) with invariant mass compatible with that of
the Z boson, and an imbalance in transverse momentum smaller than 25 GeV. When a bremsstrahlung
photon is emitted prior to the first three hits in the tracker, leading to an underestimation of
$p_{in}$, or when the amount of radiated energy is very low, $p_{out}$ and $p_{in}$ have similar
values, and $p_{out}$ can be measured to be greater than $p_{in}$, leading thereby to negative
values of $f_{brem}$. In the central barrel region, the amount of intervening material is small,
and the bremsstrahlung fraction peaks at low values, contrary to the outer region, where the
amount of material is large and leads to a sizable population of electrons emitting high
fractions of their energies through bremsstrahlung. For the background, chiefly composed of
hadron tracks misidentified as electrons, the bremsstrahlung fraction generally peaks at very
small values. The increased contribution of background at high values of bremsstrahlung fraction
that can be observed in Figs. 7 b), c), and d), is ascribed to residual early photon conversions
and nuclear interactions within the tracker material.

The disagreement observed between data and simulation in the endcap region is attributed
to an imperfect modelling of the material in simulation. In fact, the $f_{brem}$ variable is a
perfect tool for accessing the intervening material, and a direct comparison of the mean value
of $f_{brem}$ in data and in simulation in narrow bins of $\eta$ indicates that the description of
the material in certain regions is imperfect. For example, a localized region near
$|\eta| \approx 0.5$ where there are complicated connections of the TOB to its wheels, and beyond
$|\eta| \approx 0.8$ where there is a region of inactive material, do not have the material
properly represented in the simulation [27]. The observed difference between data and
simulation, relevant for updating the simulated geometry in future analyses, is taken into
account in the analysis of 8 TeV data, through specific corrections applied to the electron
momentum scale, resolution, and identification and reconstruction efficiencies extracted from
$Z \to e^+e^-$ events, as discussed in Sections 4.8.4 and 6.
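
The definition of the tracker bremsstrahlung fraction used throughout this discussion can be written as a one-line helper. The function name is hypothetical; the only content is the formula $f_{brem} = [p_{in} - p_{out}]/p_{in}$ quoted above.

```python
def brem_fraction(p_in, p_out):
    """Tracker bremsstrahlung fraction f_brem = (p_in - p_out) / p_in.

    p_in:  momentum at the point of closest approach to the beam spot (GeV).
    p_out: momentum extrapolated to the ECAL surface from the tracker exit (GeV).
    """
    if p_in <= 0.0:
        raise ValueError("p_in must be positive")
    return (p_in - p_out) / p_in
```

For example, `brem_fraction(40.0, 10.0)` gives 0.75, while a mismeasured track with `p_out` slightly above `p_in` yields a small negative value, as noted in the text.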







Figure 7: Distribution of $f_{brem}$ for electrons from $Z \to e^+e^-$ data (dots) and simulated
(solid histograms) events, and from background-enriched events in data (triangles), in a) the
central barrel $|\eta| < 0.8$, b) outer barrel $0.8 < |\eta| < 1.44$, c) endcaps
$1.57 < |\eta| < 2$, and d) endcaps $|\eta| > 2$. The distributions are normalized to the area of
the $Z \to e^+e^-$ data distributions.

4.3 Electron particle-flow clustering

The PF clustering of electrons is driven by GSF tracks, and is independent of the way they are 
seeded. For each GSF track, several PF clusters, corresponding to the electron at the ECAL
surface and the bremsstrahlung photons emitted along its trajectory, are grouped together. The
PF cluster corresponding to the electron at the ECAL surface is the one matched to the track at
the exit of the tracker. Since most of the material is concentrated in the layers of the tracker, for
each layer a straight line is extrapolated to the ECAL, tangent to the electron track, and each
matching PF cluster is added to the electron PF cluster. Most of the bremsstrahlung photons are
recovered in this way, but some converted photons can be missed. For these photons, a specific
procedure selects displaced KF tracks through a dedicated MVA algorithm, and kinematically
associates them with the PF clusters. In addition, for ECAL-seeded isolated electrons, any
PF clusters matched geometrically with the hybrid or multi-5×5 SC are also added to the PF
electron cluster. 

4.4 Association between track and cluster 

The electron candidates are constructed from the association of a GSF track and a cluster in
the ECAL. For ECAL-seeded electrons, the ECAL cluster associated with the track is simply
the one reconstructed through the hybrid or the multi-5×5 algorithm that led to the seed. For
electrons seeded only through the tracker-based approach, the association is made with the 
electron PF cluster. 

The track-cluster association criterion, just like the seeding selection, is designed to preserve
the highest efficiency while reducing the misidentification probability, and it is therefore not
very restrictive along the direction of the track curvature affected by bremsstrahlung. For
ECAL-seeded electrons, this requires a geometrical matching between the GSF track and the SC,
such as:

• $|\Delta\eta| = |\eta_{SC} - \eta_{in}^{extrap}| < 0.02$, with $\eta_{SC}$ being the SC
energy-weighted position in $\eta$, and $\eta_{in}^{extrap}$ the track $\eta$ extrapolated from
the innermost track position and direction to the position of closest approach to the SC,

• $|\Delta\phi| = |\phi_{SC} - \phi_{in}^{extrap}| < 0.15$, with analogous definitions for $\phi$.
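
The geometrical matching for ECAL-seeded electrons listed above can be sketched as follows. This is an illustrative helper with hypothetical names; the extrapolation of the track to the SC is assumed to be done elsewhere, and the only subtlety handled here is that the azimuthal difference must be wrapped into $[-\pi, \pi)$ before the cut is applied.

```python
import math


def delta_phi(phi1, phi2):
    """Signed azimuthal difference phi1 - phi2, wrapped into [-pi, pi)."""
    return (phi1 - phi2 + math.pi) % (2.0 * math.pi) - math.pi


def ecal_seeded_match(eta_sc, phi_sc, eta_track, phi_track,
                      max_deta=0.02, max_dphi=0.15):
    """Apply the |delta eta| < 0.02 and |delta phi| < 0.15 matching windows.

    eta_track and phi_track are the track coordinates already extrapolated
    to the position of closest approach to the supercluster.
    """
    deta = abs(eta_sc - eta_track)
    dphi = abs(delta_phi(phi_sc, phi_track))
    return deta < max_deta and dphi < max_dphi
```

The much looser window in $\phi$ than in $\eta$ reflects the point made in the text: bremsstrahlung displaces the cluster along the bending direction, so the selection is kept loose there.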

For tracker-seeded electrons, a global identification variable is defined using an MVA technique 
that combines information on track observables (kinematics, quality, and KF track), the electron 
PF cluster observables (shape and pattern), and the association between the two (geometric and 
kinematic observables). For electrons seeded only through the tracker-based approach, a weak 
selection is applied on this global identification variable. For electrons seeded through both 
approaches, a logical OR is applied on the two selections. 

The overall efficiency is ~93% for electrons from Z decay, and the reconstruction efficiency
measured in data is compared to simulation in Section 6.1.

4.5 Resolving ambiguity 

Bremsstrahlung photons can convert into $e^+e^-$ pairs within the tracker and be reconstructed
as electron candidates. This is particularly important for $|\eta| > 2$, where electron seeds can
be used from layers of the tracker endcap that are located far from the interaction vertex and
away from the bulk of the material. In such topologies, a single electron seed can often lead
to several reconstructed tracks, especially when a bremsstrahlung photon carries a significant
fraction of the initial electron energy, so that the hits corresponding to the converted photon are
located close to the expected position of the initial track. This creates ambiguities in electron
candidates, when two nearby GSF tracks share the same SC.

To resolve this problem, the following criteria are used, based on the small probability of a 
bremsstrahlung photon to convert in the tracker material just after its point of emission. The 
number of missing inner hits is obtained from the intersections between the track trajectory
and the active inner layers. 

• When two GSF tracks have a different number of missing inner hits, the one with 
the smallest number is retained. 

• When the number of missing inner hits is the same, and both candidates have an 
ECAL-based seed, the one with $E_{SC}/p$ closest to unity is chosen, where $p$ is the track
momentum evaluated at the interaction vertex. 

• The same criterion is also applied when both candidates have the same number of 
missing inner hits and just tracker-based seeds.

• When the number of missing inner hits is the same, but only one candidate is just
tracker-seeded, the track with an ECAL-based seed is chosen, because the tracks 
from tracker-based seeds have a higher chance to be contaminated by track segments 
from conversions. 
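
The four criteria above amount to an ordered sequence of tie-breakers between two candidates sharing an SC. The sketch below is one way to express that ordering; the `Candidate` class and its field names are hypothetical and are not taken from the CMS software.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    missing_inner_hits: int  # number of missing inner hits on the GSF track
    ecal_seeded: bool        # True if the candidate has an ECAL-based seed
    e_sc: float              # supercluster energy (GeV)
    p_vertex: float          # track momentum at the interaction vertex (GeV)


def resolve_ambiguity(a, b):
    """Return the candidate to keep, following the criteria listed above."""
    # 1) Fewer missing inner hits wins.
    if a.missing_inner_hits != b.missing_inner_hits:
        return a if a.missing_inner_hits < b.missing_inner_hits else b
    # 2) Same number of missing hits, different seeding: prefer the ECAL-based seed.
    if a.ecal_seeded != b.ecal_seeded:
        return a if a.ecal_seeded else b
    # 3) Same seeding type: keep the one with E_SC / p closest to unity.
    return min((a, b), key=lambda c: abs(c.e_sc / c.p_vertex - 1.0))
```

Checking the seeding type before the $E_{SC}/p$ comparison is equivalent to the bullet order in the text, since the $E_{SC}/p$ criterion is only used when both candidates have the same kind of seed.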

4.6 Relative ECAL to tracker alignment with electrons 

Electrons are also used to probe subtle detector effects such as the ECAL alignment relative to 
the tracker. The tracker was first aligned using cosmic rays before the start of LHC operations, 
and constantly refined using proton-proton collisions, reaching an accuracy < 10 μm llT^ . The
relative alignment of the tracker to the ECAL for 2012 data is obtained using electrons from Z
boson decays. Tight identification and isolation criteria are applied to both electrons with
$E_T > 30$ GeV, and the dielectron invariant mass is required to satisfy
$|m_{e^+e^-} - m_Z| < 7.5$ GeV, to ensure a high signal purity of 97%, needed for the alignment
procedure. In addition, to disentangle bremsstrahlung effects from position reconstruction, only
electrons with little bremsstrahlung and best energy measurement are considered. The distances
$\Delta\eta$ and $\Delta\phi$, defined in Section 4.4, are compared between data and simulation,
the ECAL being aligned with the tracker in the simulation. The position of each supermodule in
the barrel and each half-disk in the endcaps is measured relative to the tracker by minimizing
the differences between data and simulation as a function of the alignment coefficients.
Residual misalignments lower than 2 x 10^^ rad in $\Delta\phi$ and 2 x 10^^ units in
$\Delta\eta$ are obtained using this procedure, which is compatible with
expectations from simulation. 

4.7 Charge estimation 

The measurement of the electron charge is affected by bremsstrahlung followed by photon 
conversions. In particular, when the bremsstrahlung photons convert upstream in the detector, 
they lead to very complex hit patterns, and the contributions from conversions can be wrongly 
included in the fitting of the electron track. 

A natural choice for a charge estimate is the sign of the GSF track curvature, which
unfortunately can be altered in the presence of conversions. The resulting misidentification
probability is largest for $|\eta| > 2$, where it can reach about 10% for reconstructed electrons
from Z boson decay without further selection. This is improved by combining two other charge
estimates, one that is based on the associated KF track matched to a GSF track when at least one
hit is shared in the innermost region, and the second one that is evaluated using the SC
position, and defined as the sign of the difference in $\phi$ between the vector joining the beam
spot to the SC position and the vector joining the beam spot and the first hit of the electron
GSF track.

The electron charge is defined by the sign shared by at least two of the three estimates, and
is referred to as the "majority method". The misidentification probability of this algorithm is
predicted by simulation to be 1.5% for reconstructed electrons from Z boson decays without
further selection, offering thereby a global improvement on the charge-misidentification
probability of about a factor of 2 relative to the charge given by the GSF track curvature
alone. It also reduces the misidentification probability at very large $|\eta|$, where it is
predicted to be <7% for such electrons. Higher purity can be obtained by requiring all three
measurements to agree, termed the "selective method". This yields a misidentification
probability of <0.2% in the central part of the barrel, <0.5% in the outer part of the barrel,
and <1.0% in the endcaps, which can be achieved at the price of an efficiency loss that depends
on $p_T$, but is typically $\approx$7% for electrons from Z boson decays. The selective
algorithm is used mainly in analyses where the charge estimate is crucial, for example in the
study of charge asymmetry in inclusive W boson production [28], or in searches for
supersymmetry using same-charge dileptons [29].
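
The two ways of combining the three charge estimates can be sketched as below. The function names are hypothetical, each estimate is assumed to be encoded as +1 or -1, and all three are assumed to be available (the case where no KF track is matched to the GSF track is not covered by this toy).

```python
def majority_charge(q_gsf, q_kf, q_sc):
    """Sign shared by at least two of the three charge estimates (each +1 or -1)."""
    return 1 if (q_gsf + q_kf + q_sc) > 0 else -1


def selective_charge(q_gsf, q_kf, q_sc):
    """Common sign if all three estimates agree, otherwise None (candidate rejected)."""
    if q_gsf == q_kf == q_sc:
        return q_gsf
    return None
```

Returning `None` in the selective method makes the efficiency loss explicit: candidates whose estimates disagree are dropped rather than assigned a charge.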

The charge misidentification probability decreases strongly when the identification selections
become more restrictive, mainly because of the suppression of photon conversions. Table 5
gives the measurement in data and simulation of the charge misidentification probability that
can be achieved for a tight selection of electrons (corresponding to the HLT criteria) from
$Z \to e^+e^-$ decays in the barrel and in the endcaps, for the majority and the selective
methods. These values are estimated by comparing the number of same-charge and opposite-charge
dielectron pairs that are extracted from a fit to the dielectron invariant mass. The
misidentification probability is significantly reduced relative to the one at the
reconstruction level. A good agreement is found between data and simulation in both ECAL
regions and for both charge-estimation methods.

Table 5: Charge misidentification probability for a tight selection of electrons from
$Z \to e^+e^-$ decays in the barrel and in the endcaps, for the majority and for the selective
methods used to estimate electron charge. Only statistical uncertainties are shown in the table.

Method      Barrel                              Endcaps
            Simulation        Data              Simulation      Data
majority    0.13 ± 0.01%      0.14 ± 0.01%      1.4 ± 0.2%      1.6 ± 0.2%
selective   0.017 ± 0.002%    0.020 ± 0.002%    0.21 ± 0.02%    0.23 ± 0.02%


4.8 Estimation of electron momentum

The electron momentum is estimated using a combination of the tracker and ECAL measurements.
As for all electron observables, it is particularly sensitive to the pattern of bremsstrahlung
photons and their conversions. To achieve the best possible measurement of electron
momentum, electrons are classified according to their bremsstrahlung pattern, using observables
sensitive to the emission and conversion of photons along the electron trajectory. The
SC energy is corrected and calibrated, then the combination between the tracker and ECAL
measurements is performed.


4.8.1 Classification 


For most of the electrons, the bremsstrahlung fraction in the tracker $f_{brem}$, defined in
Section 4.2.2, is complemented by the bremsstrahlung fraction in the ECAL, defined as
$f_{brem}^{ECAL} = [E_{SC}^{PF} - E_{ele}^{PF}]/E_{SC}^{PF}$, where $E_{SC}^{PF}$ and
$E_{ele}^{PF}$ are the SC energy and the electron-cluster energy measured with the PF algorithm,
that correspond respectively to the initial and final electron energies. The number of clusters
in the SC is also used in the classification process.

Electrons are classified in the following categories:

• "Golden" electrons are those with little bremsstrahlung and consequently provide 
the most accurate estimation of momentum. They are defined by an SC with a single 
cluster and $f_{brem} < 0.5$.

• "Big-brem" electrons have a large amount of bremsstrahlung radiated in a single
step, either very early or very late along the electron trajectory. They are defined by
an SC with a single cluster and $f_{brem} > 0.5$.

• "Showering" electrons have a large amount of bremsstrahlung radiated all along the 
electron trajectory, and are defined by an SC containing several clusters. 

In addition, two special electron categories are defined. One is termed "crack" electrons,
defined as electrons with the SC seed crystal adjacent to an $\eta$ boundary between the modules
of the ECAL barrel, or between the ECAL barrel and endcaps, or at the high $|\eta|$ edge of the
endcaps. The second category, called "bad track", requires a calorimetric bremsstrahlung fraction
that is significantly larger than the track bremsstrahlung fraction
($f_{brem}^{ECAL} - f_{brem} > 0.15$), which identifies electrons with a poorly fitted track in
the innermost part of the trajectory.
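
The classification described above can be sketched as a small decision function. The thresholds come from the text; everything else is an assumption made for illustration: the function and argument names are hypothetical, the order in which the categories are tested (crack, then bad track, then the three main classes) is not stated in the text, and the value $f_{brem} = 0.5$ exactly is arbitrarily assigned to "big-brem".

```python
def classify_electron(n_clusters, f_brem, f_brem_ecal, near_crack):
    """Assign an electron class from the observables named in the text.

    n_clusters:  number of clusters in the supercluster.
    f_brem:      bremsstrahlung fraction measured in the tracker.
    f_brem_ecal: bremsstrahlung fraction measured in the ECAL (PF energies).
    near_crack:  True if the SC seed crystal is adjacent to an eta boundary.
    """
    # Assumed precedence: the two special categories are tested first.
    if near_crack:
        return "crack"
    if f_brem_ecal - f_brem > 0.15:
        return "bad track"
    if n_clusters > 1:
        return "showering"
    # Single-cluster SC: split on the tracker bremsstrahlung fraction.
    return "golden" if f_brem < 0.5 else "big-brem"
```

A different precedence would move some electrons between categories, so the fractions quoted in the paper should not be expected to follow from this toy ordering.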

Figure 8 a) shows the fraction of the electron population in the above classes, as a function
of $|\eta|$ (defined relative to the centre of CMS), for data and simulated electrons from Z boson
decays. Crack electrons are not shown in the plot, but complement the proportion to unity. The
distributions for the golden and showering classes reflect the $\eta$ distribution of the
intervening material. Data and simulation agree well, except for the regions of $\eta$ with known
mismodelling of material, and for $|\eta| > 2$, where the number of clusters is overestimated in
the simulation. The integrated proportions of electrons in the different classes for data and
simulation are, respectively, 57.4% and 56.8% for showering, 25.5% and 26.3% for golden, 8.4% and
8.0% for big-brem, 4.1% and 4.1% for bad track, and 4.6% and 4.7% for crack electrons. Figure 8 b)
shows the distributions in the ratio of reconstructed SC energy to the generated energy
($E_{gen}$) for the different classes. The SC performs differently for each class, and provides an
energy estimate of limited quality for electrons with sizeable bremsstrahlung. An improved energy
estimate is achieved with additional corrections, as discussed in the following section.

4.8.2 ECAL supercluster energy 

Energy in individual crystals. Several procedures are used to calibrate the energy response
of individual crystals before the clustering step 01. The amplitude in each crystal is
reconstructed using a linear combination of the 40 MHz sampling of the pulse shape. This amplitude
is then converted into an energy value using factors measured separately for the ECAL barrel,
endcaps, and the preshower detector. The changes in the crystal response induced by radiation
are corrected through the ECAL laser-monitoring system [30, 31], and the correction factors are
checked using the reconstructed dielectron invariant mass in $Z \to e^+e^-$ events, and through
the ratio of the ECAL energy and the track momentum ($E_{SC}/p$) in $W \to e\nu$ events. The
intercalibration factors between crystals are obtained with data using different methods, e.g. the
$\phi$ symmetry of the energy in minimum-bias events for a given $\eta$, the reconstructed
invariant mass of $\pi^0 \to \gamma\gamma$, $\eta \to \gamma\gamma$, and $Z \to e^+e^-$ events, and
the $E_{SC}/p$ ratio of electrons in $W \to e\nu$ events.

Supercluster energy correction. The SC energy is obtained by summing the individual
energies in all the crystals of an SC, and the preshower energies of electrons in the endcaps.
At this stage, the main effects impacting the estimation of SC energy are related to energy
containment:

Figure 8: a) Fraction of population in different classes of electrons from Z boson decays as
a function of $|\eta|$, for data (dots) and simulated (histograms) events, and b) distribution of
$E_{SC}/E_{gen}$ for the different classes of simulated electrons. Crack electrons are not shown in
either plot.


• energy leakage in $\phi$ or $\eta$ out of the SC,

• energy leakage into the gaps between crystals, modules, supermodules, and the
transition region between barrel and endcaps,

• energy leakage into the HCAL downstream of the ECAL,

• energy loss in interactions in the material before the ECAL, and 

• additional energy from pileup interactions. 


An MVA regression technique [32] is used to obtain the SC corrections that are needed to
account for these effects. Simulated electrons with a uniform spectrum in $\eta$ and $p_T$ between
5 and 300 GeV are used to train the regression algorithm, separately for electrons in the barrel
and in the endcaps. The regression target is the ratio $E_{gen}/E_{SC}$. The first input
observables are the SC energy to be corrected, and the SC position in $\eta$ and $\phi$, which are
related to the intervening material. The energy leakage out of the SC is assessed through the SC
shape observables and its number of clusters, together with their individual respective
positions, energies, and shape observables. The energy leakage in the gaps between modules,
supermodules and in the transition region between the barrel and endcaps is explored through the
position of the seed crystal of the SC. The position of the seed cluster relative to the seed
crystal is used together with the shower-shape observables to account for energy leakage between
the crystals. The ratio $H/E_{SC}$ (defined in Section 4.2.1) is used to estimate the energy
leakage into the HCAL. The effects of pileup interactions are assessed through the number of
reconstructed interaction vertices and the average energy density $\rho$ in the event (defined as
the median of the energy density distribution for particles within the area of any jet in the
event, reconstructed using the $k_T$-clustering algorithm [33, 34] with distance parameter of 0.6,
$p_T > 3$ GeV and within $|\eta| < 2.5$).


Figure 9 shows the distribution in the ratio of the corrected SC energy over the generated
energy, $E_{SC}^{cor}/E_{gen}$, obtained through the regression for two categories of simulated
electrons: low-$p_T$ electrons ($7 < p_T < 10$ GeV) in the central part of the barrel, and
medium-$p_T$ electrons ($30 < p_T < 35$ GeV) in the forward part of the endcaps. The distributions
are fitted with a "double" Crystal Ball function [35]. The Crystal Ball function is defined as:

$$
f_{CB}(x; \alpha, n, m_{CB}, \sigma_{CB}) = N \cdot
\begin{cases}
A \left( B - \dfrac{x - m_{CB}}{\sigma_{CB}} \right)^{-n}, & \text{for } \dfrac{x - m_{CB}}{\sigma_{CB}} < -\alpha, \\[2ex]
\exp\left( -\dfrac{(x - m_{CB})^2}{2\sigma_{CB}^2} \right), & \text{for } \dfrac{x - m_{CB}}{\sigma_{CB}} > -\alpha,
\end{cases}
$$

where $A$ and $B$ are functions of $\alpha$ and $n$, and $N$ is a normalization factor. This
function is intended to capture both the Gaussian core of the distribution (described by
$\sigma_{CB}$) and non-Gaussian tails (described by the parameters $n$ and $\alpha$). The double
Crystal Ball function is a modified Crystal Ball with the $\sigma_{CB}$, $n$, and $\alpha$
parameters distinct for $x$ values below and above the peak position at $m_{CB}$.
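
For readers who want to evaluate the single Crystal Ball shape numerically, a minimal sketch follows. The text only states that $A$ and $B$ are functions of $\alpha$ and $n$; the expressions used here, $A = (n/|\alpha|)^n \exp(-\alpha^2/2)$ and $B = n/|\alpha| - |\alpha|$, are the conventional choice that makes the function and its first derivative continuous at the junction, and are an assumption as far as this paper is concerned. The normalization is left as a free factor.

```python
import math


def crystal_ball(x, alpha, n, m_cb, sigma_cb, norm=1.0):
    """Single Crystal Ball shape: Gaussian core with a power-law low-side tail.

    A and B use the conventional continuity conditions (assumed, see lead-in).
    `norm` plays the role of N and is not computed here.
    """
    t = (x - m_cb) / sigma_cb
    a = abs(alpha)
    if t > -a:
        # Gaussian core
        return norm * math.exp(-0.5 * t * t)
    # Power-law tail
    big_a = (n / a) ** n * math.exp(-0.5 * a * a)
    big_b = n / a - a
    return norm * big_a * (big_b - t) ** (-n)
```

A double Crystal Ball, as used for the fits in the text, would apply the same construction on the high side with its own set of tail and width parameters.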


The peak position and the standard deviation of the Gaussian core of the distributions are
estimated through the fitted values of $m_{CB}$ and $\sigma_{CB}$, respectively. The "effective"
standard deviation $\sigma_{eff}$, defined as half of the smallest interval around the peak
position containing 68.3% of the electrons, is used to assess the resolution, while taking into
account possible non-Gaussian tails. A bias of at most 1% affects the peak position, which
reflects the asymmetric nature of the $E_{gen}/E_{SC}$ distribution.
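
The effective standard deviation defined above can be estimated from a list of unbinned values with a sliding window over the sorted sample. This is a sketch under stated assumptions: the function name is hypothetical, the window is the narrowest one containing at least 68.3% of the entries (for a unimodal distribution it contains the peak), and no interpolation between entries is attempted.

```python
import math


def effective_sigma(values, fraction=0.683):
    """Half-width of the smallest interval containing `fraction` of the entries."""
    xs = sorted(values)
    n = len(xs)
    # Number of entries the window must contain.
    k = math.ceil(fraction * n)
    if k < 2:
        raise ValueError("not enough entries to define an interval")
    # Slide a window of k consecutive sorted entries and keep the narrowest one.
    width = min(xs[i + k - 1] - xs[i] for i in range(n - k + 1))
    return 0.5 * width
```

For a pure Gaussian sample this converges to the usual standard deviation, while non-Gaussian tails inflate it less than they would inflate an RMS.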



Figure 9: Example distributions of the ratio of corrected over generated supercluster energies
($E_{SC}^{cor}/E_{gen}$) and their (double Crystal Ball) fits, in two regions of $\eta$ and $p_T$
after implementing the regression corrections: for electrons a) with $7 < p_T < 10$ GeV and
$|\eta_{SC}| < 1$, and b) with $30 < p_T < 35$ GeV and $2 < |\eta_{SC}| < 2.5$, $\eta_{SC}$ being
defined relative to the centre of CMS. Electrons are generated with uniform distributions in
$\eta$ and $p_T$.


The peak position of $E_{SC}^{cor}/E_{gen}$ and the effective resolution for $E_{SC}^{cor}$ are
shown in Fig. 10 as a function of the number of reconstructed interaction vertices for
low-$p_T$ and medium-$p_T$ electrons, in the barrel and in the endcaps. The bias in the peak
position is independent of the number of pileup interactions. The effective resolution is in the
range of 2-3% for medium-$p_T$ electrons in the barrel, and in the range of 7-9% for low-$p_T$
electrons in the endcaps, degrading slowly with increasing number of pileup interactions.


The use of the MVA regression technique, compared to a standard parameterization of the
correction for $E_{SC}$ as a function of the electron $\eta$, category, and $E_T$, provides a
significant improvement of $\approx$20% in the resolution on average and up to $\approx$35% in
the forward regions, while reducing the bias in the peak position for each electron class over
the entire range of electron $\eta$ and $p_T$.




Figure 10: a) Peak position of $E_{SC}^{cor}/E_{gen}$, and b) effective resolution of
$E_{SC}^{cor}$, as a function of the number of reconstructed interaction vertices, for electrons
in the barrel (solid symbols) and endcaps (open symbols) with $7 < p_T < 20$ GeV (circles), and
$20 < p_T < 50$ GeV (squares). Electrons are generated with uniform distributions in $\eta$ and
$p_T$.

Another MVA regression technique, based on the same input variables, is used to estimate the
uncertainty in the corrected $E_{SC}$, separately for electrons in the barrel and in the endcaps,
with the absolute difference between $E_{gen}$ and the corrected $E_{SC}$ being the target.

Fine-tuning of calibration and simulated resolution. The SC energy corrections described
above are based on simulation. Events in data are used to account for any discrepancy between
data and simulation in input variables, as well as to correct for biases. The applied remnant
corrections are quite small. The energy in individual crystals is already calibrated, and
simulation of showers in the ECAL is rather precise and includes the measured uncertainties in
the inter-calibration between crystals. The main source of discrepancy between the energy
estimate in data and in simulation is the imperfect description of the tracker material in
simulation, which affects each category of electrons differently. The evolution of the
transparency of the crystals and of the noise in the ECAL during data taking, if not considered
through specific run-dependent simulations, leads to an additional difference between data and
simulation. Another possible source of discrepancy could be the underestimation of uncertainties
in the calibration of individual crystals. Finally, a difference in the ECAL geometry relative to
the nominal one can cause the corrections discussed in the previous paragraph, which are
obtained using simulated events with the nominal geometry, to be inappropriate for data. While
it is now understood that at least one of the above effects contributes to degradation, their
relative magnitudes are not yet fully clear. More details on this issue can be found in Ref. [27].

The SC energy scale is corrected in the data to match that in simulation. These corrections are
assessed using Z → e+e− events, by comparing the dielectron invariant mass in data and in
simulation for four |η| regions and two categories of electrons, over 50 running periods,
following the procedure described in Ref. |!2-. The η regions are defined, from the most central
to the most forward values, as barrel |η| < 1, barrel |η| > 1, endcaps |η| < 2, and endcaps
|η| > 2. The R_9 variable, defined as the ratio of the energy reconstructed in the 3×3 crystal
matrix centered on the crystal with the most energy to the SC energy, is used to assess the
amount of bremsstrahlung emitted by the electron. The category of electrons with a low level
of bremsstrahlung is defined by R_9 > 0.94, and the one with a high level of bremsstrahlung
by R_9 < 0.94. The Z boson mass is reconstructed from the SC energies and the opening angles
measured from the tracks. The mass distribution in the range between 60 and 120 GeV is fitted
using a Breit-Wigner function convolved with a Crystal Ball function, both for data and
simulation. The scale corrections, obtained from the difference between the peak positions
measured in the data and in simulation, are applied to the data, so that the peak position of
the Z boson mass agrees with that in simulation in each category. Overall, these corrections
vary between 0.9880 and 1.0076 and their uncertainties between 0.0002 and 0.0029.
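
The categorization and the application of a per-category scale factor can be illustrated with a
minimal Python sketch. The function names, the use of the ratio of fitted peak positions
(simulation over data) as the multiplicative correction, and the assignment of the boundary
value R_9 = 0.94 to the high-bremsstrahlung category are assumptions made for this example
only; they are not taken from the CMS software.

```python
def electron_category(abs_eta, r9, in_barrel):
    """Return the (eta region, bremsstrahlung category) pair for one electron.

    The |eta| boundaries (1 in the barrel, 2 in the endcaps) and the R9
    threshold of 0.94 are the ones quoted in the text. Whether the electron
    is in the barrel or the endcaps is passed in by the caller.
    """
    if in_barrel:
        eta_region = "barrel_low_eta" if abs_eta < 1.0 else "barrel_high_eta"
    else:
        eta_region = "endcap_low_eta" if abs_eta < 2.0 else "endcap_high_eta"
    # Assumption: R9 exactly equal to 0.94 is counted as high bremsstrahlung.
    brem = "low_brem" if r9 > 0.94 else "high_brem"
    return eta_region, brem


def scale_correction(peak_simulation, peak_data):
    """Assumed form of the correction: ratio of the fitted Z peak positions."""
    return peak_simulation / peak_data


def apply_scale(e_sc, correction):
    """Apply the multiplicative scale correction to a SC energy in data."""
    return e_sc * correction
```

With this convention, a category in which the fitted peak in data sits below the one in
simulation receives a correction slightly larger than one, in line with the quoted range of
0.9880 to 1.0076.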

The estimate of the SC energy resolution is also affected by the sources of discrepancy between
data and simulation. A correction is applied in simulation to match the resolution observed in
data lij. This correction is independent of time, and is evaluated for the above categories of η
and R_9. The SC energy is modified by applying a factor drawn from a Gaussian distribution,
centered on the corrected scale value, with a standard deviation corresponding to a
required additional constant term in the energy resolution. The value of this standard deviation
for each electron category is assessed using a maximum-likelihood fit of the data to a
resolution-broadened simulated energy. This constant term in the energy resolution ranges from
(0.92 ± 0.03)% in the |η| < 1 and R_9 > 0.94 category, to (2.90 ± 0.03)% in the |η| > 2 and
R_9 < 0.94 category. The uncertainty in the SC energy is increased accordingly.
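
The broadening step described above can be sketched in a few lines of Python. This is only an
illustration under stated assumptions: the function and parameter names are hypothetical, and
the extra constant term is passed as a fraction (for example 0.0092 for 0.92%).

```python
import random


def smear_energy(e_sc, scale, extra_constant_term, rng):
    """Broaden a simulated SC energy.

    A multiplicative factor is drawn from a Gaussian centered on `scale`
    with standard deviation `extra_constant_term` (a fraction, not a
    percentage), and applied to the SC energy.
    """
    factor = rng.gauss(scale, extra_constant_term)
    return e_sc * factor


# Example: a 50 GeV simulated SC energy in a category with a 0.92% extra term.
rng = random.Random(2012)  # fixed seed so that the example is reproducible
smeared = smear_energy(50.0, 1.0, 0.0092, rng)
```

Drawing the factor independently for each simulated electron widens the simulated energy
distribution without shifting its mean away from the chosen scale value.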

4.8.3 Combination of energy and momentum measurements 

The electron momentum estimate p_comb is improved by combining the ECAL SC energy, after
applying the refinements mentioned in the previous sections, with the track momentum. At
energies <15 GeV, or for electrons near gaps in detectors, the track momentum is expected to
be more precise than the ECAL SC energy. A regression technique is used to define a weight
w that multiplies the track momentum in a linear combination with the estimated SC energy as

p_{\mathrm{comb}} = w\,p + (1 - w)\,E_{\mathrm{SC}}.

The complementarity of the two estimates depends on the amount of emitted bremsstrahlung.
The corrected SC energy and its relative uncertainty, and the track momentum and its relative
uncertainty are the main input observables. The addition of the E_SC/p ratio and its uncertainty,
together with the ratio of the two relative uncertainties, brings higher-level information that
optimizes the performance of the regression. The electron class and the position in the barrel
or endcaps are also included as probes of the quality and amount of emitted bremsstrahlung.
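
The linear combination itself is simple to write down. In the sketch below, the weight w is an
input; the regression that provides it in the paper is not reproduced here. As a stand-in for
illustration only, an inverse-variance weight built from the two absolute uncertainties is
shown; this is a swapped-in textbook choice, not the method used by CMS, and all names are
hypothetical.

```python
def combine_momentum(p_track, e_sc, w):
    """Return p_comb = w * p + (1 - w) * E_SC for a weight w in [0, 1]."""
    if not 0.0 <= w <= 1.0:
        raise ValueError("the weight w must lie between 0 and 1")
    return w * p_track + (1.0 - w) * e_sc


def inverse_variance_weight(sigma_p, sigma_e):
    """Illustrative stand-in for the regression output.

    Weight of the track momentum when two uncorrelated estimates with
    absolute uncertainties sigma_p and sigma_e are averaged.
    """
    return sigma_e ** 2 / (sigma_p ** 2 + sigma_e ** 2)


# Low-energy example where the track is the more precise measurement.
w = inverse_variance_weight(sigma_p=0.2, sigma_e=0.6)
p_comb = combine_momentum(p_track=10.0, e_sc=11.0, w=w)
```

In this example the weight is close to one, so the combined estimate stays close to the track
momentum, which mirrors the low-energy behaviour described in the text.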

After combining the two estimates, the bias in the electron momentum is reduced in all
regions and all electron classes, except for showering electrons in the endcaps, where the bias
becomes slightly worse. Figure 11 shows the effective resolution in the electron momentum (in
percent), after combining the E_SC and p estimates, as a function of the generated p_T, compared
to the effective resolution of the corrected SC energy, for golden electrons in the barrel and
for showering electrons in the endcaps. The improvement is typically 25% for electrons with
p_T ≈ 15 GeV in the barrel and reaches 50% for golden electrons of p_T < 10 GeV.

The improvement in resolution is significant for all electrons in the barrel up to energies of
about 35 GeV, as can be seen in Fig. 12 a), which displays the effective resolution of the
corrected SC energy, of the track momentum, and of the electron momentum after combining E_SC
and p estimates, as a function of the generated electron energy. Figure 12 b) shows the
expected reconstructed mass for a 126 GeV Higgs boson in the H → ZZ* → 4e decay channel.
The masses reconstructed using the corrected SC energy are compared to those using the
electron momentum obtained after combining the E_SC and p estimates. The improvement in the
effective resolution is 7%. When considering only the Gaussian core of the distribution, the
improvement in the resolution is 9%.

Figure 11: Effective resolution in electron momentum after combining the E_SC and p estimates
(solid symbols), compared to that of the corrected SC energy (open symbols), as a function of
the generated electron p_T. Golden electrons in the barrel (circles) and showering electrons in the
endcaps (squares) are shown as examples. Electrons are generated with uniform distributions
in η and p_T, and the resolution is shown after applying the resolution broadening.

4.8.4 Uncertainty in the momentum scale and in the resolution

The corrections to the momentum scale and resolution discussed above are obtained only from
correcting the SC energy in Z → e+e− events. As a consequence, they must be further
corrected, first over a large range of p_T, especially for the H → ZZ* analysis, which uses
electrons with p_T as low as 7 GeV, and second for the E_SC and p combination. For this purpose,
Z → e+e− events are used together with J/ψ → e+e− and Υ → e+e− events that provide clean
sources of electrons at low p_T. The reconstructed invariant masses of these resonances in data
are compared with simulation to probe any remaining differences.

Figure 13 shows an example of such comparisons and their degree of agreement for two
extreme categories of events: one where each electron is well measured, having a single-cluster
SC (golden or big-brem class) in the barrel, and the other where each electron has a
multi-cluster SC, or is poorly measured (showering, crack, or bad track class) in the endcaps. These
two categories represent the breadth of performance in data that enters, for example, in the
mass measurement of the benchmark process for Higgs boson decays to four leptons. The
distributions in data and in simulation are fitted with a Breit-Wigner function convolved with a
Crystal Ball function,

P(m_{e^+e^-}; m_Z, \Gamma_Z, \alpha, n, m_{\mathrm{CB}}, \sigma_{\mathrm{CB}}) = BW(m_{e^+e^-}; m_Z, \Gamma_Z) \otimes f_{\mathrm{CB}}(m_{e^+e^-}; \alpha, n, m_{\mathrm{CB}}, \sigma_{\mathrm{CB}}),

where m_Z and Γ_Z are fixed to the nominal values of 91.188 and 2.485 GeV Il3^ .

Figure 12: a) Effective resolution in electron momentum after combining E_SC and p estimates
(solid circles), compared to those using the corrected SC energy (triangles), and the track
momentum (squares), as a function of the generated energy for electrons in the barrel. Also shown
is the resolution in momentum after combining E_SC and p estimates in terms of σ_CB (open
circles), to illustrate the contribution of the Gaussian core of the distribution. Electrons are
generated with uniform distributions in η and p_T. b) Reconstructed mass of the Higgs boson for
H(126) → ZZ* → 4e simulated events, using either the corrected SC energy (open triangles)
or the electron momentum after combining E_SC and p estimates (solid dots) Q-

Figure 13: Dielectron invariant mass distribution from Z → e+e− events in data (solid squares)
compared to simulation (open circles) fitted with a convolution of a Breit-Wigner function
and a Crystal Ball function, a) for the best-resolved event category with two well-measured
single-cluster electrons in the barrel (BGBG), and b) for the worst-resolved category with two
more-difficult patterns or multi-cluster electrons in the endcaps (ESES). The masses at which
the fitting functions have their maximum values, termed m_peak, and the effective standard
deviations σ_eff are given in the plots. The data-to-simulation factors are shown below the main
panels.

The effective standard deviation σ_eff, which is indicated in the plots, is calculated as the
effective standard deviation of the function f_CB, and therefore does not include the contribution
from the width of the Z boson. In both categories of events, the data and simulation show
good agreement. The σ_eff values in data for the Z → e+e− invariant mass are 1.13 ± 0.01 GeV
and 2.88 ± 0.02 GeV for the best and worst categories, respectively. Considering only the Gaussian
cores of the distributions, the standard deviations (σ_CB) are 1.00 ± 0.01 GeV and 2.63 ± 0.02 GeV,
for the best and the worst categories, respectively. The effective and Gaussian invariant mass
resolutions of dielectron events in the data range, respectively, from 1.2 and 1.1% for the best
category with two well-measured single-cluster electrons in the barrel, to 3.2 and 2.9% for the
worst category with two poorly measured or multi-cluster electrons in the endcaps. The
effective and Gaussian momentum resolutions for single electrons, approximated by multiplying
the dielectron mass resolution by √2, therefore range in data from 1.7 and 1.6%, to 4.5 and
4.1%, respectively.
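
The arithmetic behind these percentages can be checked with a short Python sketch. It rests on
two assumptions that are not spelled out in the text: the relative mass resolution is taken as
the standard deviation divided by the nominal Z boson mass, and the factor √2 follows from
m² = 2 E_1 E_2 (1 − cos θ) when both electrons have the same, uncorrelated relative resolution
and the uncertainty in the opening angle is neglected, so that σ_m/m = (1/√2) σ_E/E. The
function names are hypothetical.

```python
import math

M_Z_GEV = 91.188  # nominal Z boson mass quoted in the text


def relative_mass_resolution(sigma_gev, mass_gev=M_Z_GEV):
    """Mass resolution in percent (assumed normalization: the nominal m_Z)."""
    return 100.0 * sigma_gev / mass_gev


def single_electron_resolution(mass_resolution_percent):
    """Approximate single-electron resolution: sqrt(2) times the mass resolution."""
    return math.sqrt(2.0) * mass_resolution_percent


# Standard deviations in GeV quoted in the text for the Z -> e+e- mass in data.
quoted = [
    ("best category, effective", 1.13),
    ("best category, Gaussian", 1.00),
    ("worst category, effective", 2.88),
    ("worst category, Gaussian", 2.63),
]
for label, sigma_gev in quoted:
    # The mass resolution is rounded to one decimal first, as quoted in the text.
    mass_res = round(relative_mass_resolution(sigma_gev), 1)
    electron_res = round(single_electron_resolution(mass_res), 1)
    print(f"{label}: mass {mass_res}%, single electron {electron_res}%")
```

Under these assumptions the rounded mass resolutions, multiplied by √2, give the single-electron
values quoted in the paragraph above.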

The data-to-simulation comparisons are performed for different categories of events based on
η, p_T, and class of electron, and for different instantaneous luminosities. The scale corrections
are applied to data, and the resolutions are broadened in the simulated distributions, as
discussed in Section 4.8.2.

For the study of the momentum scale, the p_T and η categories are defined according to the p_T
and η of one of the two electrons. The other electron is used to tag the Z event: it satisfies
tight identification requirements (as described in Section |^) and has p_T > 20 GeV. The fits
are performed using signal templates (obtained from simulation as binned distributions) that
are convolved with Gaussians with floating means and standard deviations. A p_T dependence
of the momentum scale of up to 0.6% in the barrel and 1.5% in the endcaps is observed and
corrected in the p_T range between 7 and 70 GeV. The final performance of the momentum scale
is shown in Fig. 14 a) as the relative difference between data and simulation of the J/ψ → e+e−,
Υ → e+e−, and Z → e+e− mass peaks, as a function of the p_T of one electron and for
several η regions of this electron, integrating over the p_T and η of the other electron. The
residual scale difference between data and MC simulation is at most 0.2% in the barrel and
0.3% in the endcaps. These numbers are taken as systematic uncertainties in the momentum
scale of electrons in the barrel and in the endcaps. For the study of the resolution, the p_T, η, and
class categories are defined for both electrons from the Z decay. The fits are performed using
a Breit-Wigner function convolved with a Crystal Ball function. The agreement between data
and simulation in effective resolution is shown in Fig. 14 b), in terms of the relative difference
between data and simulation for the J/ψ → e+e− and Z → e+e− events, as a function of
the p_T of one electron, for different categories of electrons. Overall, the relative difference in
effective resolution between data and simulation is less than 10% for all the categories in this
comparison.

4.8.5 High-energy electrons 

For high-energy electrons, the E_SC and p combination is dominated entirely by the energy
measurement in the ECAL. Because of this and for reasons of simplicity, analyses exploiting
high-energy electrons, with typical energies above 250 GeV, estimate the electron momentum using
only the SC information. Moreover, energy depositions from very high-energy electrons (from
about 1500 GeV in the barrel and from about 3000 GeV in the endcaps) lead to a saturation of
the front-end electronics liTTI .

Figure 14: Relative differences between data and simulation as a function of electron p_T for
different |η| regions, a) for the momentum scale measured using J/ψ → e+e−, Υ → e+e−,
and Z → e+e− events |91, and b) for the effective momentum resolution of Z → e+e− and
J/ψ → e+e− events for different electron categories.

Both the calibration of high-energy electrons and the energy correction for saturated crystals
are tuned with Z → e+e− events through a method that estimates the energy contained in the
central (highest energy) crystal of a 5×5 matrix, using the 24 lower-energy surrounding
crystals. The energy fraction contained in the central crystal relative to the 5×5 matrix (E_1/E_5×5) is
parameterized as a function of the electron η, E_5×5, as well as other SC shower-shape variables,
using simulated high-mass DY events. The parameterization is validated with data through a
comparison of the central crystal energy with the energy estimated from the parameterization.
The energy scale is validated at the 1-2% level using electrons with energy larger than 500 GeV
in data. The dominant uncertainty arises from the limited number of high-energy electrons
available for this study.
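
The recovery of a saturated central crystal from its neighbours can be illustrated with a
one-line relation. If f denotes the predicted fraction E_1/E_5×5 and E_24 the summed energy of
the 24 surrounding crystals, then E_5×5 = E_1 + E_24 gives E_1 = f E_24 / (1 − f). The Python
sketch below assumes that the fraction has already been obtained from a parameterization; the
parameterization itself is not reproduced, and the names are hypothetical.

```python
def central_crystal_energy(e_24, predicted_fraction):
    """Estimate the energy of the central crystal of a 5x5 matrix.

    e_24: summed energy of the 24 surrounding, unsaturated crystals.
    predicted_fraction: assumed value of E_1 / E_5x5, strictly between 0 and 1.
    Uses E_5x5 = E_1 + E_24, hence E_1 = f * E_24 / (1 - f).
    """
    if not 0.0 < predicted_fraction < 1.0:
        raise ValueError("the predicted fraction must lie strictly between 0 and 1")
    return predicted_fraction * e_24 / (1.0 - predicted_fraction)


# Example: 400 GeV in the surrounding crystals and a predicted fraction of 0.8.
e_1 = central_crystal_energy(400.0, 0.8)
```

For unsaturated electrons, the same relation allows the estimated central energy to be compared
with the measured one, which is the kind of validation described in the text.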


5 Electron selection 

5.1 Identification 

Several strategies are used in CMS to identify prompt isolated electrons (signal), and to
separate them from background sources, mainly originating from photon conversions, jets
misidentified as electrons, or electrons from semileptonic decays of b and c quarks. Simple and robust
algorithms have been developed to apply sequential selections on a set of discriminants. More
complex algorithms combine variables in an MVA analysis to achieve better discrimination. In
addition, dedicated selections are used for highly energetic electrons.

Variables that provide discriminating power are grouped into three main categories: 

• Observables that compare measurements obtained from the ECAL and the tracker
(track-cluster matching, including both geometrical and SC energy-track momentum
matching).

• Purely calorimetric observables used to separate genuine electrons (signal electrons
or electrons from photon conversions) from misidentified electrons (e.g., jets with
large electromagnetic components), based on the transverse shape of electromagnetic
showers in the ECAL and exploiting the fact that electromagnetic showers are
narrower than hadronic showers. Also utilized are the energy fractions deposited
in the HCAL (expected to be small, as electromagnetic showers are essentially fully
contained in the ECAL), as well as the energy deposited in the preshower in the endcaps.

• Tracking observables employed to improve the separation between electrons and 
charged hadrons, exploiting the information obtained from the GSF-fitted track, and 
the difference between the information from the KF and GSF-fitted tracks. 


An example of a purely tracking variable, f_brem, was given in an earlier figure. Figure 15 shows
examples of ECAL-only and track-cluster matching variables. The simulated signal consists of
reconstructed electrons compatible with those generated from Z → e+e− decays, using a
run-dependent version of the simulation. The data are electrons reconstructed in a sample
dominated by Z → e+e− events. To achieve sufficient purity in data, a stringent requirement of
|m_e+e− − m_Z| < 7.5 GeV is applied, both in data and in simulation, on the invariant mass of
the two electrons. Both electrons are required to be isolated: for each electron, the scalar sum
of the transverse momenta of the PF candidates in a cone around its direction (excluding the
electron) is required to be less than 10% of the electron p_T. The background sample consists of
misidentified electrons from jets in Z+jets data. This sample is selected by requiring a pair of
identified leptons (electrons or muons) with an invariant mass compatible with that of the Z boson.
To suppress the contribution from events with associated production of W and Z bosons, the
imbalance in the transverse momentum of the event is required to be smaller than 25 GeV (which
also suppresses tt̄ events). One additional electron candidate must be present in the event,
which is required not to be isolated by inverting the selection used for signal. In the e+e−+jets
events, the invariant mass of the dielectron pair formed by the misidentified-electron candidate and
the electron of opposite sign from the Z → e+e− decay must be greater than 4 GeV, in order
to reject contributions from lower-mass resonances. As a consequence of these requirements,
the control sample consists largely of events with one Z boson and one jet that is
misidentified as the additional electron. All signal and background electrons are also required to have
p_T > 20 GeV and to satisfy some simple criteria to reject electrons from photon conversions.


The distance Δη, previously defined in Section 4.4, is shown in Figs. 15 a) and b). The agreement
between data and simulation is very good for electrons in the barrel. Disagreement is observed
in the endcaps, which is related to the mismodelled material in simulation. The Δη indeed
increases with the amount of bremsstrahlung, which for the endcaps is somewhat larger in
data than in simulation.


The lateral extension of the shower along the η direction is expressed in terms of the variable
σ_ηη, which is defined as (σ_ηη)² = [Σ (η_i − η̄)² w_i] / Σ w_i. The sum runs over the 5×5 matrix
of crystals around the highest-E_T crystal of the SC, and w_i is a weight that depends
logarithmically on the contained energy. The positions η_i are expressed in units of crystals, which has the
advantage that the variable-size gaps between ECAL crystals (in particular at module
boundaries) can be ignored. The variable σ_ηη is shown in Figs. 15 c) and d). The discrimination power
of σ_ηη is greater than that of the analogous variable in φ, because bremsstrahlung strongly affects
the pattern of energy deposition in the ECAL along the φ direction. A small disagreement between
data and simulation is visible in the barrel, and is mainly due to the limited tuning of
electromagnetic showers in simulation (improved in Geant4 Release 10.0 [37]). For electrons in
the endcaps, the main factor determining the resolution of the shower-shape variables is the
pileup. Since this is well described in the run-dependent version of simulation, the agreement
between data and simulation in these plots is regarded as quite good.
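
A weighted width of this kind can be computed as in the following Python sketch. It is an
illustration under explicit assumptions: the logarithmic weight is taken as
w_i = max(0, w0 + ln(E_i / E_5×5)) with a hypothetical default w0 = 4.7, the mean position is
the mean computed with the same weights, and crystal positions are given as integer indices
along η. None of these details are specified in the text above.

```python
import math


def sigma_eta_eta(crystals, w0=4.7):
    """Log-weighted shower width along eta, in units of crystals.

    crystals: list of (eta_index, energy) pairs for the 5x5 matrix.
    w0: hypothetical offset of the logarithmic weight (assumption).
    """
    positive = [(eta, e) for eta, e in crystals if e > 0.0]
    e_total = sum(e for _, e in positive)
    if e_total <= 0.0:
        return 0.0
    # Assumed weight: grows logarithmically with the crystal energy fraction.
    weighted = [(eta, max(0.0, w0 + math.log(e / e_total))) for eta, e in positive]
    w_sum = sum(w for _, w in weighted)
    if w_sum == 0.0:
        return 0.0
    eta_mean = sum(eta * w for eta, w in weighted) / w_sum
    variance = sum(w * (eta - eta_mean) ** 2 for eta, w in weighted) / w_sum
    return math.sqrt(variance)


# A narrow and a broader toy energy pattern along eta (one row of crystals).
narrow = sigma_eta_eta([(-1, 1.0), (0, 98.0), (1, 1.0)])
broad = sigma_eta_eta([(-1, 10.0), (0, 80.0), (1, 10.0)])
```

In this toy example the narrower energy pattern gives the smaller width, which is the behaviour
exploited to separate electromagnetic showers from broader hadronic deposits.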


Finally, Figs. 15 e) and f) show the distributions in 1/E_SC − 1/p, where E_SC is the SC energy
and p the track momentum at the point of closest approach to the vertex. Good agreement is
observed between data and simulation both in the barrel and in the endcaps. In all cases, the
distributions for signal and background electrons are well separated.






Figure 15: Distributions in the distance Δη between the position of the SC and the track
extrapolated to the point of closest approach to the SC are shown for a) the barrel and b) the endcaps.
Distributions in the shower-shape variable σ_ηη, defined in the text, are shown in c) and d).
Distributions in energy-momentum matching 1/E_SC − 1/p, as defined in the text, are shown in e)
and f). Distributions are shown for electrons from Z → e+e− data (dots) and simulated (solid
histograms) events, and from background-enriched events in data (triangles). All distributions
are normalized to the area of the respective Z → e+e− data. (See text for details on the
composition of the samples.)

To maximize the sensitivity of electron identification, several variables are combined using
the "boosted decision tree" (BDT) algorithm f2S\ . The set of observables in each category is
extended relative to the simpler sequential selection as follows: the track-cluster matching
observables are computed both at the ECAL surface and at the vertex, the SC substructure is
exploited, and more information related to the cluster shape is used, as well as the f_brem fraction.
Similar sets of variables are used for electrons in the barrel and in the endcaps. Two types of
BDT are defined that depend on whether the electron passes HLT identification requirements
("triggering electron") or does not ("not-triggering electron"). For triggering electrons, loose
identification and isolation requirements are applied as a preselection, to mimic the
requirements applied at the HLT. Dedicated training can then best exploit the discriminating
power of the variables in the remaining phase space. In the following, results are presented just for
not-triggering electrons, since the training and performance of the two algorithms are similar. The
BDT is trained in several bins of p_T and η. To model the signal, reconstructed electrons are used
when they match electrons with p_T in the range between 5 and 100 GeV in generated events.
The background is modelled using misidentified electrons reconstructed in W+jets events in
data. The distribution of variables in these training samples is found to be in agreement with
the one observed in the samples used in the analyses. The signal and background BDT output
distributions are compared in Fig. 16, where a comparison between data and simulation is also
given for signal electrons. The same selections are used as in Fig. 15, and the same signal
and background samples. The discriminating power of the BDT algorithm is evident, and the
agreement between data and simulation is good. The small difference observed is due to the
differences in input variables, which were described in the previous paragraphs.




Figure 16: Output of the electron-identification BDT for electrons from Z → e+e− data (dots)
and simulated (solid histograms) events, and from background-enriched events in data
(triangles), in the ECAL a) barrel, and b) endcaps. All the distributions are normalized to the area of
the respective Z → e+e− data. (See text for details on the composition of the samples.)

The performance of the BDT-based and the sequential electron-identification algorithms for
four selected working points is compared in Fig. 17 for electrons with p_T > 20 GeV. Signal
electrons from Z → e+e− events in a simulated sample are compared with
misidentified electrons from jets reconstructed in data. The same selections and samples are
used as in Fig. 15. As expected, better performance is obtained when the variables are combined
in an MVA discriminant such as the BDT. In the ECAL barrel and endcaps, a working point of
the sequential selection with signal-electron efficiencies of about 90% and 84%, respectively,
has efficiencies of about 7% and 9% for background electrons. For the same signal efficiency,
the misidentification probability using the BDT algorithm is reduced by about a factor of two.

Figure 17: Performance of the electron BDT-based identification algorithm (red dots) compared
with results from working points of the sequential selection (only the identification part) for
electron candidates in the ECAL a) barrel, and b) endcaps. (See text for details on the
composition of the samples.)

Although the focus of the analysis thus far has been on electrons with p_T > 20 GeV, this
identification strategy is also adopted at smaller p_T. The agreement between data and simulation
in the p_T range between 7 and 15 GeV was studied using electrons from J/ψ meson decays. As
an illustration, Fig. 18 shows a comparison between data and simulation for two variables,
using events with both electrons in the barrel, and the run-dependent version of simulation. The
remnant background is subtracted statistically, using the sPlot technique [38], through a fit to
the dielectron invariant mass. The agreement between data and simulation is very good, both
for simple variables such as σ_ηη, shown in Fig. 18 a), and for more complex ones, such as the
BDT output shown in Fig. 18 b).

5.2 Isolation requirements 

A significant fraction of background to isolated primary electrons is due to misidentified jets
or to genuine electrons within a jet resulting from semileptonic decays of b or c quarks. In both
cases, the electron candidates have significant energy flow near their trajectories, and requiring
electrons to be isolated from such nearby activity greatly reduces these sources of background.
The isolation requirements are separated from electron identification, as the interplay between
them tends to be analysis-dependent. Moreover, the inversion of isolation requirements,
independent of those used for identification, provides control of different sources of such
backgrounds in data.

Two isolation techniques are used at CMS. The simplest one is referred to as detector-based
isolation, and relies on the sum of energy depositions either in the ECAL or in the HCAL
around each electron trajectory, or on the scalar sum of the p_T of all tracks reconstructed from
the collision vertex. These sums are usually computed within cone radii of ΔR = 0.3 or 0.4
around the electron direction, and remove contributions from the electron through smaller
exclusion cones. This procedure, which has good performance in rejecting jets misidentified as
electrons, is used by the HLT, and in certain analyses in which just mild background rejection
suffices.

Figure 18: Distribution of a) the shower-shape variable σ_ηη, defined in the text, and b) the
output of the BDT electron identification algorithm for electron candidates in the ECAL barrel,
in data (symbols) and simulation (histograms). A statistical subtraction of the background is
applied using the sPlot technique. (See text for details.)


Most of the offline analyses, however, benefit from the PF technique for defining isolation
quantities. Rather than using energy measurements in independent subdetectors, the
isolation is defined using the PF candidates reconstructed with a momentum direction within some
chosen cone of isolation. In this way, the correct calibration can be used, and a possible
double counting of energy assigned to particle candidates is avoided. When an electron candidate is
misidentified by the PF as another particle, it enters the isolation sum, and artificially increases
the size of the isolation observable. This effect increases when the identification efficiency of
the PF decreases. Electron-candidate identification using PF performs very well for electrons
in the ECAL barrel, where no additional corrections for removing electron contributions to the
isolation sum are needed. However, in the endcaps, and in the version of the reconstruction
used for the results discussed in this paper, the electron identification applied through the PF
is not fully efficient. Therefore, in line with what is done in the detector-based approach, veto
cones are applied for charged hadrons and photons when the isolation sums are computed.

A comparison between the performance of the two techniques is given in Fig. 19 for electrons
with p_T > 20 GeV (with no pileup correction applied). Signal electrons from Z → e+e− events
in a simulated sample are compared with misidentified electrons from jets reconstructed in
Z+jets data. The run-dependent version of the simulation is used. A loose identification is
applied in reconstructing PF electrons, and only the electron candidates that pass this selection are
considered in performing a meaningful comparison. Better performance is obtained when the
information from all detectors is combined using the PF technique, especially in the endcaps.

Figure 19: Performance of the detector-based isolation algorithm (red squares) compared with
that using PF (blue triangles) in the ECAL a) barrel, and b) endcaps. (See text for the definition
of the samples.)

The PF isolation is defined as

\mathrm{Iso}_{\mathrm{PF}} = \sum p_{\mathrm{T}}^{\mathrm{charged}} + \max\left[ 0, \sum p_{\mathrm{T}}^{\mathrm{neutral\,had}} + \sum p_{\mathrm{T}}^{\gamma} - p_{\mathrm{T}}^{\mathrm{PU}} \right],    (2)

where the sums run over the charged PF candidates, neutral hadrons and photons, within a
chosen ΔR cone around the electron direction. The charged candidates are required to
originate from the vertex of the event of interest, and p_T^PU is a correction related to event pileup.
The isolation-related quantities are among the observables most sensitive to the extra energy
from pileup interactions (occurring either in the same or in earlier bunch crossings), which spoils
the isolation efficiency when there are many interactions per bunch crossing. The contribution



of the PF-based isolation is also shown as a function of the number of reconstructed
proton-proton collision vertices. The charged component of the isolation becomes independent of
pileup once only candidates compatible with the vertex of interest are considered. For both ρ
and the neutral component of the isolation, the dependence is almost linear. The effective area
A_eff in (η, φ) is defined, for each component of the isolation, by (ΔR)², scaled by the ratio of
the slopes for ρ and for the considered component shown in Fig. 20 a). Once the correction is
applied to the neutral components, the dependence on the number of vertices is much reduced,
as shown in Fig. 20 b). The plots refer to electrons with |η| < 1, but similar conclusions hold in
any range of η.
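
The structure of the pileup-corrected PF isolation can be made concrete with a short Python
sketch. The pileup term is taken here as the product of the energy density ρ and the effective
area A_eff; this form is an assumption of the example, since the sentence defining the
correction is incomplete in this text. The function and argument names are hypothetical, and
the effective area is supplied by the caller rather than derived from fitted slopes.

```python
def pf_isolation(sum_charged, sum_neutral_had, sum_photon, rho, a_eff):
    """Pileup-corrected PF isolation (all sums in GeV).

    sum_charged: scalar pT sum of charged PF candidates from the vertex of interest.
    sum_neutral_had, sum_photon: scalar pT sums of neutral hadrons and photons.
    rho * a_eff: assumed form of the pileup correction to the neutral part.
    The neutral part is not allowed to become negative after the subtraction.
    """
    pileup = rho * a_eff
    return sum_charged + max(0.0, sum_neutral_had + sum_photon - pileup)


def relative_isolation(isolation, electron_pt):
    """Isolation divided by the electron pT."""
    return isolation / electron_pt


# Toy example: only part of the neutral energy is attributed to pileup.
iso = pf_isolation(sum_charged=1.0, sum_neutral_had=2.0, sum_photon=3.0,
                   rho=10.0, a_eff=0.1)
rel_iso = relative_isolation(iso, electron_pt=50.0)
```

The max(0, ...) term keeps an over-subtraction of the neutral component from lowering the
isolation below its charged part, which is not corrected because it is already restricted to
candidates from the vertex of interest.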

Figure 21 shows the distributions of the Iso_PF variable divided by the electron p_T, for signal and
background electrons, after the correction for pileup contributions. For signal electrons, both
data and simulation are shown. The samples and selection criteria presented in Section 5.1 are
used without the isolation requirement, which is replaced by a loose selection on the BDT
identification discriminant. Excellent discrimination is observed between signal and background,
and there is also good agreement between data and simulation. The remnant discrepancy in
the endcaps is mostly due to the difference of the PF electron identification efficiency in data
and in simulation, which is reflected in different contributions from misidentified particles to
the isolation sums as discussed above. This difference is not completely recovered through the
use of the additional exclusion cones.

Figure 20: Average energy density as a function of the number of reconstructed proton-proton
collision vertices, for electron candidates with p_T > 20 GeV and |η| < 1 from data dominated
by Z → e+e− events. The energy density ρ (open dots) is shown, along with each component
of the particle isolation: a) neutral particles (red dots) and charged particles associated with the
vertex (blue dots), and b) before (pink dots) and after (green dots) the correction for pileup on
PF isolation.

Figure 21: Distributions of PF isolation divided by electron p_T, after applying the pileup
correction discussed in the text, for electrons from Z → e+e− data (dots) and simulated (solid
histograms) events, and from background-enriched events in data (triangles), in the ECAL a)
barrel, and b) endcaps. (See text for more details on the compositions of the samples.)

5.3 Rejection of converted photons

An important source of background to prompt electrons arises from secondary electrons
produced in conversions of photons in the tracker material.

To reject this background, CMS algorithms exploit the pattern of track hits. When photon
conversions take place inside the volume of the tracker, the first hit on electron tracks from the
converted photons is often not located in the innermost layer of the tracker, and missing hits
are therefore present in that region. For prompt electrons, whose trajectories start from the
beamline, no missing hits are expected in the inner layers. In addition to the missing hits,
photon conversion candidates can also be rejected using a fit to the reconstructed electron tracks.
Since the photon is massless, and momentum transfer is in general small, the conversions have
a well-defined topology, with tracks that have essentially the same tangent at the conversion
vertex in the (r, φ) and (r, z) planes. The strategy for rejecting these candidates consists of
fitting the track pairs to a common vertex, incorporating this topological constraint, and then
rejecting the converted photon candidates according to the χ² probability of the fit. Also, the
impact parameters (ip) of the electron, such as the transverse (d_0) and longitudinal (d_z)
distance to the vertex at the point of closest approach in the transverse plane, or the ratio of the
uncertainty in the three-dimensional impact parameter to its value (σ_ip/ip), are used
to reject secondary electrons.

Overall, when the requirement of no missing hits is applied together with a selection on the
probability of the described fit to a common vertex, the inefficiency for prompt electrons
in a simulated Z → e+e- sample is of the order of a percent. The rejection factor computed
for the background data described in the previous paragraphs is about 45%. These
performance figures depend strongly on the selections applied to define the electron candidates,
since that affects the background composition, and therefore the fraction of photon
conversions. The quoted numbers refer to electron candidates passing the "MVA selection" detailed
in Section 5.4, without using the selection based on the number of missing hits.

The algorithms described above are used in combination with other selection variables
discussed in the next section to select prompt electrons.

5.4 Reference selections 

Scientific analyses must balance efficiency and purity, depending on the levels of signal and
background, by defining their own electron selections through a combination of different
algorithms. This subsection summarizes some of the basic selections used widely at CMS. The
efficiency and misidentification rates, along with a discussion of a tag-and-probe method used
to check the performance, are given in Section 6.

The sequential selection applies requirements on five identification variables among those
discussed previously: Δη, Δφ, H/E_SC, σ_ηη, and 1/E_SC − 1/p_in. In addition, a selection is also
applied on the combined PF isolation relative to the electron pT, and on the variables used to
reject converted photons. Finally, the impact parameters of the electron, d0 and dz, are required
to be small for the electron to originate from the vertex of interest. The sequential selection,
initially developed for measuring the W boson and Z boson cross sections, is still used in standard
model analyses, where the signal yield is large enough that the efficiency is not the most
important issue. Three working points were originally designed to have average efficiencies
of about 90, 80, and 70% for electrons from Z → e+e- events, and were optimized separately
for electrons in the ECAL barrel and endcaps. For the analysis of 8 TeV data, four working
points are defined: loose, medium, tight, and a very loose point for analyses aiming at vetoing
electrons. The selections corresponding to the medium working point are given in Table 6.

Table 6: Requirements corresponding to the medium working point of the sequential selection
for electrons in the ECAL barrel and endcaps. At most one missing hit is allowed.

Variable                      Upper value, barrel    Upper value, endcaps
|Δη|                          0.004                  0.007
|Δφ|                          0.06 rad               0.03 rad
H/E_SC                        0.12                   0.10
σ_ηη                          0.01                   0.03
|1/E_SC − 1/p|                0.05 GeV⁻¹             0.05 GeV⁻¹
IsoPF (ΔR = 0.3) / pT         0.15                   0.15
|d0|                          0.02 cm                0.02 cm
|dz|                          0.1 cm                 0.1 cm
Missing hits                  1                      1
Conversion-fit probability    10⁻⁶                   10⁻⁶
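
As an illustration only, the requirements of Table 6 can be expressed as a simple cut-based
function. The sketch below is not the CMS implementation: the class and field names are
hypothetical, strict inequalities are assumed for the upper values, and a candidate without a
reconstructed conversion vertex is assumed to carry a conversion-fit probability of zero.

```python
from dataclasses import dataclass


@dataclass
class ElectronCandidate:
    """Hypothetical container for the Table 6 variables (names are assumptions)."""
    in_barrel: bool               # True for ECAL barrel, False for endcaps
    abs_delta_eta: float          # |Δη|
    abs_delta_phi: float          # |Δφ| in rad
    h_over_e: float               # H/E_SC
    sigma_eta_eta: float          # σ_ηη
    abs_inv_e_minus_inv_p: float  # |1/E_SC − 1/p| in GeV^-1
    rel_pf_iso: float             # IsoPF(ΔR = 0.3) / pT
    abs_d0: float                 # |d0| in cm
    abs_dz: float                 # |dz| in cm
    missing_hits: int
    conversion_fit_prob: float


# Upper values from Table 6 as (barrel, endcaps).
MEDIUM_UPPER_VALUES = {
    "abs_delta_eta": (0.004, 0.007),
    "abs_delta_phi": (0.06, 0.03),
    "h_over_e": (0.12, 0.10),
    "sigma_eta_eta": (0.01, 0.03),
    "abs_inv_e_minus_inv_p": (0.05, 0.05),
    "rel_pf_iso": (0.15, 0.15),
    "abs_d0": (0.02, 0.02),
    "abs_dz": (0.1, 0.1),
    "conversion_fit_prob": (1e-6, 1e-6),
}
MAX_MISSING_HITS = 1  # "At most one missing hit is allowed."


def passes_medium(electron: ElectronCandidate) -> bool:
    """Return True if every Table 6 requirement is satisfied."""
    region = 0 if electron.in_barrel else 1
    for name, limits in MEDIUM_UPPER_VALUES.items():
        # Assumption: each variable must be strictly below its upper value.
        if not getattr(electron, name) < limits[region]:
            return False
    return electron.missing_hits <= MAX_MISSING_HITS
```

Keeping the thresholds in a single table-like dictionary mirrors the layout of Table 6 and makes
it straightforward to define other working points with different values.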


The MVA selection combines requirements on the output of the identification BDT described
in Section 5, on the combined PF isolation, and on rejection variables for photon conversion.
The example discussed in this paper is the selection used in the search for the H → ZZ* → 4ℓ
process [9], which exploits the BDT optimized to identify electrons that are not required to pass
the trigger selection. In the training, the BDT for these non-triggering electrons does not use
any variables related to electron impact parameters, or variables used to suppress conversions.
Therefore such variables can be exploited in scientific analyses. For the H → ZZ* → 4ℓ
analysis, a requirement on the significance of the three-dimensional impact parameter |σ_ip/ip| < 4 is
applied, and the number of missing hits is required to be at most 1. The combined IsoPF/pT is
required to be less than 0.4 in a cone of ΔR = 0.4. The selection is optimized in six categories of
electron pT and η to maximize the expected sensitivity, using two pT ranges (7 < pT < 10 GeV,
and pT > 10 GeV), and three |η| regions (|η| < 0.80, 0.80 < |η| < 1.48, and 1.48 < |η| < 2.50),
corresponding to two regions in the barrel with different amounts of material in front of the
ECAL, and one region in the endcaps. The MVA selection is used mainly in analyses that
require high efficiency down to low pT, as well as sufficient background rejection. Examples of
such analyses are the Higgs boson searches in leptonic final states.
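
The six-category scheme can be sketched as a small lookup. The function name and the category
numbering below are hypothetical, and the treatment of values lying exactly on a bin edge is an
assumption, since the text does not specify it.

```python
from typing import Optional


def mva_category(pt: float, abs_eta: float) -> Optional[int]:
    """Map an electron to one of six (pT, |eta|) categories, or None if outside.

    Numbering (an assumption): 0-2 for 7 < pT < 10 GeV, 3-5 for pT >= 10 GeV,
    each ordered as inner barrel, outer barrel, endcaps.
    """
    if pt <= 7.0 or abs_eta >= 2.50:
        return None
    pt_bin = 0 if pt < 10.0 else 1
    if abs_eta < 0.80:
        eta_bin = 0  # barrel, less material in front of the ECAL
    elif abs_eta < 1.48:
        eta_bin = 1  # barrel, more material in front of the ECAL
    else:
        eta_bin = 2  # endcaps
    return 3 * pt_bin + eta_bin
```

A per-category BDT threshold could then be stored in a six-element list indexed by the returned
value.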

In addition, CMS has developed a specialized algorithm for the selection of high-pT electrons
(HEEP, i.e. High Energy Electron Pairs). Variables similar to those in the sequential selection are
used to select large-pT electrons, starting at 35 GeV and up to about 1 TeV. The main difference
is the usage of the detector-based isolation instead of PF isolation (the two algorithms offer
similar performance). Also, in the barrel, the ratio of the energy collected in n × m arrays of
crystals (either E1×5/E5×5 or E2×5/E5×5) is used, since this is found to be more effective at
high pT than using σ_ηη. This selection was adopted in many of the searches for exotic particles
published by the CMS experiment, e.g. Ref. [10].


6 Electron efficiencies and misidentification probabilities 

A method based on the tag-and-probe (T&P) technique [42] exploits Z/γ* → e+e- events in
data to estimate the reconstruction and selection efficiencies for signal electrons. The method
requires one electron candidate, called the "tag", to satisfy tight selection requirements.
Different criteria are tried to define the tag electron, and it is found that the estimated efficiencies
are almost insensitive to any specific definition of the tag. For the results in this paper, tag
electrons are required to satisfy pT > 25 GeV and the tight working point of the sequential
selection or, for analyses involving very high-pT electrons, to satisfy pT > 35 GeV and the HEEP
selection. A second electron candidate, called the "probe", is required to pass specific criteria
that depend on the efficiency under study. The invariant mass of the two electrons is required
to be within a window around the Z boson mass of 60 < m(e+e-) < 120 GeV, which extends
sufficiently far from the peak region to enable the background component to be extracted in
the fit, and which is matched to the window used by the analyses that rely on this method. A
requirement for having leptons of opposite charge can also be enforced. When more than two
tag-probe matches are found, they are all used in the procedure to minimize possible biases
produced by some specific choice.
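
The pairing logic described above can be illustrated with a minimal sketch. The dictionary keys
and function names are hypothetical, electrons are treated as massless when computing the
invariant mass, and the tag definition is reduced to a pT threshold plus a boolean "tight" flag.

```python
import math
from itertools import permutations


def invariant_mass(pt1, eta1, phi1, pt2, eta2, phi2):
    """Invariant mass of two massless particles from (pT, eta, phi)."""
    m2 = 2.0 * pt1 * pt2 * (math.cosh(eta1 - eta2) - math.cos(phi1 - phi2))
    return math.sqrt(max(m2, 0.0))


def tag_probe_pairs(electrons, tag_pt_min=25.0, mass_window=(60.0, 120.0),
                    require_opposite_charge=True):
    """Return every (tag, probe, mass) combination passing the requirements.

    All valid combinations are kept, so an event with two tight electrons
    contributes two pairs, one for each choice of tag.
    """
    pairs = []
    for tag, probe in permutations(electrons, 2):
        if not (tag["tight"] and tag["pt"] > tag_pt_min):
            continue
        if require_opposite_charge and tag["charge"] * probe["charge"] >= 0:
            continue
        mass = invariant_mass(tag["pt"], tag["eta"], tag["phi"],
                              probe["pt"], probe["eta"], probe["phi"])
        if mass_window[0] < mass < mass_window[1]:
            pairs.append((tag, probe, mass))
    return pairs
```

Probe-specific criteria, which depend on the efficiency under study, would be applied to the
second element of each returned pair.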

The number of probes passing any chosen selection is determined from fits to the invariant
mass distribution that include contributions from signal and background. Different models can
be used in the fit to disentangle the two components. In the absence of a kinematic selection on the
tag-and-probe candidates, the background component in the mass spectrum is well described
by a falling exponential. However, the kinematic restrictions on the Z candidates in each pT and
η range of the probe candidate distort the mass spectrum in a way which is well described by
an error function. Consequently, the background component of the mass spectrum is described
by a falling exponential multiplied by an error function. In the fits, all parameters of the
exponential and of the error function are allowed to float. The fit to the signal component can use
analytic expressions, or be based on templates from simulation. When using analytical
functions, a Breit-Wigner function with the Z boson mass and natural width taken from Ref. [13] is
convolved with a Crystal Ball function that acts as the resolution function, and multiplied by
a falling exponential function, to model the signal in the mass region between 60 and 70 GeV.
If a template from simulation is used, the signal part of the distribution is modelled through a
sample of simulated electrons from Z → e+e- decays, convolved with a resolution function to
account for any remnant differences in resolution between data and simulation. In all cases, a
simultaneous fit is performed for events where probes pass or fail the requirements, to account
for their correlation. An alternative to fitting is the subtraction of the background contribution
using predictions from simulation or techniques based on control samples in data. This is the
case of the HEEP selection efficiency, as detailed in the following.
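
For illustration, one possible form of the background shape (a falling exponential multiplied
by an error function) is sketched below. The exact parametrization used in the fits is not
given in the text, so the functional form, the parameter names, and the absence of any
normalization are all assumptions.

```python
import math


def background_shape(mass, turn_on, width, slope):
    """Unnormalized background model: error-function turn-on times falling exponential.

    turn_on and width (GeV) set the position and sharpness of the kinematic
    turn-on; slope (GeV^-1) sets the exponential fall-off. In a fit, all three
    parameters would be left free.
    """
    threshold = 0.5 * (1.0 + math.erf((mass - turn_on) / width))
    return threshold * math.exp(-slope * mass)
```

With this form the shape is suppressed below the turn-on and reduces to a pure exponential
well above it, which reproduces the behaviour described for probes without kinematic
restrictions.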

The same T&P technique is applied to data and simulated events to compare efficiencies, and
to evaluate the data-to-simulation ratios. In many analyses, these scaling factors are applied as
corrections to the simulation, or are used in computing systematic uncertainties. The efficiency
in simulation is estimated from Z → e+e- signal samples that contain no background. A
geometrical match with generated electrons is often requested to resolve ambiguities that may
arise, mainly at low pT. In data, the events used in the T&P procedure are required to satisfy
HLT paths that do not bias the efficiency under study. For the reconstruction efficiency, only
triggers requiring one electron and one SC are used, where the tag is matched to the
trigger-electron candidate and the probe is matched to the trigger SC. For selection efficiencies, triggers
requiring two electrons with requirements that are less restrictive than those under study can
also be used. In such cases, the offline tag and probe are requested to match a trigger-electron
candidate.
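
The relation between fitted yields, efficiencies, and data-to-simulation scale factors can be
summarized in a short sketch. The function names are hypothetical, and the uncertainty
propagation assumes uncorrelated uncertainties in data and simulation, which is a
simplification and not necessarily the treatment used in the measurements.

```python
import math


def efficiency(n_pass, n_fail):
    """Efficiency from the signal yields of passing and failing probes."""
    total = n_pass + n_fail
    if total <= 0:
        raise ValueError("no probes available")
    return n_pass / total


def scale_factor(eff_data, err_data, eff_sim, err_sim):
    """Data-to-simulation ratio and its uncertainty (uncorrelated inputs assumed)."""
    ratio = eff_data / eff_sim
    relative = math.hypot(err_data / eff_data, err_sim / eff_sim)
    return ratio, ratio * relative
```

A scale factor computed in each (pT, η) bin can then be applied as a per-electron weight to
simulated events.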

The fits are performed in η and pT bins, and an example of a fit to data is shown in Fig. 22. The
fits to probe electrons that pass or fail the selections are shown, respectively in a) and b). The
signal in the mass region between 60 and 70 GeV corresponds to contributions from γ* events,
from final state radiation, and from poorly measured electrons, essentially located in the ECAL
cracks.

Figure 22: Example of fits to dielectron invariant mass distributions for probe electrons with
10 < pT < 15 GeV in the ECAL barrel that a) pass or b) fail the selections on isolation and
impact parameter of the MVA selection used in Ref. [9]. Fits are shown for the signal+background
hypothesis (full line), and for the background component alone (dashed line).


Several sources of systematic uncertainty are considered in the fits. The main uncertainty is
related to the model used in the fit, and is estimated by comparing alternative distributions
for signal and background, in addition to comparing analytic functions with templates from
simulation. Only a small dependence is found on the number of bins used in the fits and on the
definition of the tag, as well as on the reweighting of the simulation to match the pileup in data.
Different event generators are also compared in the analyses, and the differences among them
are found to be negligible.

The results discussed in the next paragraphs illustrate the method applied to several reference 
selections, and the performance that is reached. 


6.1 Reconstruction efficiency 


The reconstruction efficiency is computed as a function of the ET and η of the SC, and covers
all reconstruction effects. The SC reconstruction efficiency for ET > 5 GeV is close to 100%.
To illustrate the nature of the results, the electron reconstruction efficiencies measured in data
and in DY simulated samples are shown in Fig. 23 together with the data-to-simulation scale
factors, as a function of ET, for a) |η| < 0.8, and b) 1.57 < |η| < 2.


The efficiencies are found to be >85% for ET > 10 GeV, for all η. They are compatible in
data and simulation, giving scale factors consistent with unity almost in the entire range. The
uncertainties shown on the plots correspond to the quadratic sum of the statistical and
systematic contributions, dominated by the systematic components, at the level of a few percent
for ET < 20 GeV and decreasing to <1% as ET increases. The main uncertainty is related to
the fitting function. The background contamination is large in the estimation of reconstruction
efficiency, and additional requirements are therefore applied, such as requiring the imbalance
in pT in the event to be <20 GeV. Also, the probe must be isolated, which requires the scalar pT
sum of all tracks from the vertex of interest that fall into the isolation cone to be <15% of the
probe ET. The impact of changing the definitions of these extra requirements corresponds to
the second-highest source of systematic uncertainty in this measurement.

Figure 23: Electron reconstruction efficiency measured in dielectron events in data (dots)
and DY simulation (triangles), as a function of the electron ET for a) |η| < 0.8, and b)
1.57 < |η| < 2. The bottom panels show the corresponding data-to-simulation scale factors.
The uncertainties shown in the plots correspond to the quadratic sum of the statistical and
systematic contributions.
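
The additional requirements applied to reduce the background in the reconstruction-efficiency
measurement can be written as a simple predicate. This is a sketch with hypothetical names; it
assumes the caller has already selected the tracks from the vertex of interest that fall inside
the isolation cone, whose size is not quoted in the text.

```python
def passes_probe_preselection(probe_et, track_pts_in_cone, pt_imbalance,
                              max_pt_imbalance=20.0, max_rel_track_iso=0.15):
    """Background-reduction requirements for reconstruction-efficiency probes.

    probe_et          -- ET of the probe (GeV)
    track_pts_in_cone -- pT values (GeV) of tracks from the vertex of interest
                         inside the isolation cone (selection done by the caller)
    pt_imbalance      -- imbalance in pT in the event (GeV)
    """
    if pt_imbalance >= max_pt_imbalance:
        return False
    return sum(track_pts_in_cone) < max_rel_track_iso * probe_et
```

Varying `max_pt_imbalance` and `max_rel_track_iso` around their nominal values is one way to
probe the sensitivity of the measured efficiency to these definitions.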


6.2 Selection efficiency 

The selection efficiency is computed for reconstructed electrons in bins of the electron pT and
of the η of the SC. For the sequential selection, the efficiencies of the medium working point
in data and in simulation are presented as a function of electron pT in Fig. 24 for a) |η| < 0.8,
and b) 1.57 < |η| < 2. The corresponding data-to-simulation scale factors are shown in the
bottom panels. Similarly, Figs. 24 c) and d) show the efficiencies as a function of pT for the
BDT selection, discussed in the previous section. The selections are optimized respectively
for pT > 10 GeV and pT > 7 GeV, which are the ranges shown in the plots. In general, data
and simulation agree well. The scale factors are compatible with unity, with the exception
of the low-pT region (7 < pT < 15 GeV), where they can be as low as 0.85-0.90 depending
on the selections. The uncertainties shown include contributions from both the statistical and
systematic sources. They are again dominated by systematic contributions, which are at the
level of several percent for pT < 20 GeV, and decrease below 2% when pT increases, with the
exception of the transition region between the barrel and the endcap. As for reconstruction
efficiencies, the main uncertainty originates from the choice of the fitting function. It is verified
that efficiencies are almost uniform as a function of the number of reconstructed interaction
vertices. As expected, the less restrictive the selection, the smaller is the remnant dependence
on pileup. For the working points illustrated in Fig. 24, the efficiencies decrease only by about
5% and 2% for up to 50 primary vertices, meaning that the proposed selections are almost
independent of pileup. The average number of proton-proton interactions per bunch crossing
is about 21 in the 8 TeV data.


Figure 24: Efficiency as a function of electron pT for dielectron events in data (dots) and DY
simulation (triangles), for the medium working point of the sequential selection in a) |η| < 0.8,
and b) 1.57 < |η| < 2; and for the MVA selection used in Ref. [9] in c) |η| < 0.8, and d)
1.57 < |η| < 2. The corresponding data-to-simulation scale factors are shown in the bottom
panels of each plot. The uncertainties shown in the plots correspond to the quadratic sum of
the statistical and systematic contributions.

For the HEEP selection, the efficiency is computed by subtracting the background contribution
estimated from simulation, instead of using a fit. This is done especially because of the small
number of events at large pT in data. Multijet production, which is among the dominant
contributions to the backgrounds to Z+jets, is estimated directly from data using the jet-to-electron
misidentification probabilities measured in a dedicated control sample. The measured
uncertainty of about 40% in the estimated background is the main source of systematic uncertainty.
The efficiency of the HEEP selection in data and in simulation is shown as a function of electron
pT in Fig. 25 together with the data-to-simulation scale factors. Because of the limited number
of events, only two η bins are considered, corresponding to the ECAL barrel and endcaps. The
pT region is restricted to pT > 35 GeV, and a wider pT range is covered in the barrel because of
the presence of more events there than in the endcaps. In the barrel, the efficiency ranges from
85 to 95%, and the data-to-simulation scale factors are compatible with unity. In the endcaps,
the fluctuations are larger, with efficiencies ranging from about 80 to close to 100%. The
uncertainties shown in the plots correspond to the quadratic sum of the statistical and systematic
contributions. For electrons with pT < 100 GeV, the uncertainty is dominated by systematic
sources, since this is the region where the background is more important, while above about
100 GeV the statistical uncertainty dominates.

Figure 25: Efficiency of the HEEP selection as a function of electron pT for dielectron events
in data (dots) and DY simulation (triangles) in the ECAL a) barrel, and b) endcaps. The
uncertainties shown in the plots correspond to the quadratic sum of the statistical and systematic
contributions.
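
A background-subtracted efficiency of the kind used for the HEEP selection can be sketched as
follows. The text does not give the explicit formula, so the expression below, the function
names, and the way the 40% background uncertainty is propagated (a coherent up and down
variation of the estimated background) are assumptions made for illustration.

```python
def background_subtracted_efficiency(n_pass, n_all, bkg_pass, bkg_all):
    """Efficiency after subtracting estimated background from both yields."""
    signal_all = n_all - bkg_all
    if signal_all <= 0:
        raise ValueError("background estimate exceeds the observed yield")
    return (n_pass - bkg_pass) / signal_all


def background_systematic(n_pass, n_all, bkg_pass, bkg_all, rel_bkg_unc=0.40):
    """Largest efficiency shift when the background is scaled up or down."""
    nominal = background_subtracted_efficiency(n_pass, n_all, bkg_pass, bkg_all)
    shifts = []
    for scale in (1.0 + rel_bkg_unc, 1.0 - rel_bkg_unc):
        varied = background_subtracted_efficiency(
            n_pass, n_all, scale * bkg_pass, scale * bkg_all)
        shifts.append(abs(varied - nominal))
    return max(shifts)
```

The shift grows with the background fraction, which is consistent with systematic sources
dominating in the region where the background is more important.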


6.3 Misidentification probability


To each efficiency corresponds a misidentification probability, defined as the fraction of
background candidates reconstructed as electrons that pass some set of selection criteria. The
misidentification probabilities are computed using events in data enriched in Z bosons that
also contain an additional electron candidate, as described in Section 5.


The fraction of events in which additional reconstructed electron candidates from background
contributions pass the medium working point of the sequential selection is shown in Fig. 26 a)
as a function of the candidate pT. The same fraction is shown in Fig. 26 b) for the MVA selection.
The uncertainties shown in the plots correspond to just the statistical contributions. In both
cases, the misidentification probability increases with the pT of the candidate. For the working
point of the sequential selection, it ranges from 1 to 3.5%, depending on pT and on the region
of the detector. For the MVA selection, the chosen working point [9] is less restrictive and the
misidentification probability is therefore larger (from 1 to 10.5%).
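
As a minimal sketch, the misidentification probability in a given pT bin is the fraction of
background candidates passing the selection. The function name is hypothetical, and the simple
binomial formula for the statistical uncertainty is an assumption, since the text does not
state how the statistical uncertainties are computed.

```python
import math


def misidentification_probability(n_pass, n_candidates):
    """Fraction of background candidates passing, with a binomial uncertainty."""
    if n_candidates <= 0:
        raise ValueError("no background candidates in this bin")
    prob = n_pass / n_candidates
    stat_unc = math.sqrt(prob * (1.0 - prob) / n_candidates)
    return prob, stat_unc
```

The binomial approximation becomes unreliable for very small counts, where an interval-based
estimate would be preferable.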


Figure 26: Misidentification probability, measured in data as described in the text, as a function
of the electron pT in the barrel (red dots) and endcaps (blue dots) for candidates passing a)
the medium working point of the sequential selection, and b) the working point of the MVA
selection used in Ref. [9]. The uncertainties shown in the plots correspond to just the statistical
contributions.

The main source of systematic uncertainty in the misidentification probability is related to the
composition of the sample used to extract its value. For this particular choice, it is mainly
related to the contamination from processes with genuine electrons, such as the associated
production of W and Z bosons, and tt̄ events. The selection on the imbalance in transverse
momentum strongly reduces such contamination, and therefore the systematic uncertainty, with
the consequence that the main uncertainty in the analyses comes from the difference between
the samples used to extract the misidentification probability and the one to which the result is
applied. This is strongly analysis-dependent and therefore not discussed further.


7 Summary and conclusions 

The performance of electron reconstruction and selection in CMS has been studied using data
collected in proton-proton collisions at √s = 8 TeV corresponding to an integrated luminosity
of 19.7 fb⁻¹.

Algorithms used to reconstruct electron trajectories and energy deposits in the tracker and
ECAL, respectively, have been presented. A Gaussian sum filter algorithm used for track
reconstruction provides a way to follow the track curvature and to account for bremsstrahlung loss
up to the entrance into the ECAL. The strategies for finding seeds for electron tracks,
constructing trajectories, and fitting track parameters are optimized to reconstruct the electrons down
to small pT values with high efficiency and accuracy. Moreover, the clustering of energy in the
ECAL and its optimization to recover bremsstrahlung photons are discussed. Dedicated
algorithms are used to correct the energy measured in the ECAL as well as to estimate the electron
momentum by combining independent measurements in the ECAL and in the tracker.

The overall momentum scale is calibrated with an uncertainty smaller than 0.3% in the pT
range from 7 to 70 GeV. For electrons from Z boson decays, the effective momentum resolution
varies from 1.7%, for well-measured electrons with a single-cluster supercluster in the barrel,
to 4.5% for electrons with a multi-cluster supercluster, or poorly measured, in the endcaps. The
electron momentum resolution is modelled in simulation with a precision better than 10% up
to a pT of 70 GeV.

The performance of the reconstruction algorithms in data is studied together with those of 
several benchmark selections designed to cover the needs of the physics programme of the 
CMS experiment. Good agreement is observed between data and predictions from simulation 
for most of the variables relevant to electron reconstruction and selection. The origin of small 
remaining discrepancies is understood and corrections will be implemented in the future. 

The reconstruction efficiency as well as the efficiency of all the selections are measured using
Z → e+e- samples in data and in simulation. The reconstruction efficiency in the data ranges
from 88% to 98% in the barrel and from 90% to 96% in the endcaps in the pT range from 10
to 100 GeV. The ratios of efficiencies of data to simulation, both for reconstruction and for
the different proposed selections, are found to be in general compatible with unity within the
respective uncertainties, over the full pT range, down to a pT as low as 7 GeV. Differences of up
to 5% between data and simulation are observed in most cases, while differences of up to 15%
are measured for a few points at small pT values.

The analysis of electron performance with data has shown that, despite the challenging
conditions of pileup at the LHC and the significant level of bremsstrahlung in the tracker, using
dedicated algorithms and a large number of recorded Z → e+e- decays provided successful means
of reconstructing and identifying electrons in CMS. The quality of simulation at the beginning
of the experiment was sufficiently good to require only a few adjustments to the originally
conceived reconstruction algorithms, and also enabled quick deployment of sophisticated
developments, such as PF reconstruction and the use of MVA methods for electron identification
and, later, for momentum correction. The reconstruction and selection of electrons at low pT
have been achieved with a performance level close to that anticipated at the time the detector
was designed. These achievements, especially for low-pT electrons, played an essential role in
the discovery of the Higgs boson at CMS and in the measurement of its properties [45]
in the H → ZZ* → 4ℓ channel.